STRUCTURAL PROPERTIES OF PROPORTIONAL FAIRNESS: 
STABILITY AND INSENSITIVITY 

By Laurent Massoulié 

Thomson Paris Research Lab 

In this article we provide a novel characterization of the proportionally
fair bandwidth allocation of network capacities, in terms of the
Fenchel-Legendre transform of the network capacity region. We use this
characterization to prove stability (i.e., ergodicity) of network dynamics
under proportionally fair sharing, by exhibiting a suitable Lyapunov
function. Our stability result extends previously known results to a more
general model including Markovian users routing. In particular, it implies
that the stability condition previously known under exponential service
time distributions remains valid under so-called phase-type service time
distributions.

We then exhibit a modification of proportional fairness, which coincides
with it in some asymptotic sense, is reversible (and thus insensitive), and
has an explicit stationary distribution. Finally we show that the
stationary distributions under modified proportional fairness and balanced
fairness, a sharing criterion proposed because of its insensitivity
properties, admit the same large deviations characteristics.

These results show that proportional fairness is an attractive bandwidth
allocation criterion, combining the desirable properties of ease of
implementation with performance and insensitivity.

1. Introduction. The abstract network bandwidth allocation (NBA) problem
can be formulated as follows. A network supports connections of distinct
types, indexed by $r$, the index $r$ spanning the set of types
$\mathcal{R}$, assumed finite. Given the number $x_r$ of users of each type
$r \in \mathcal{R}$, with $x_r \in \mathbb{N}$, the problem is to determine
the total capacity allocated to type $r$ users, denoted by $\lambda_r$,
with $\lambda_r \in \mathbb{R}_+$. The quantity $\lambda_r$ represents the
rate at which data is received collectively by all users of type $r$. The
allocation vector $\lambda := \{\lambda_r\}_{r \in \mathcal{R}}$ is
constrained to lie in a set $C \subset \mathbb{R}_+^{\mathcal{R}}$.

Fig. 1. Example of a two-link network supporting three types of users. 



The set C is a suitable abstraction of all the physical capacity constraints 
of the actual network under consideration. An example of a two-link network 
is represented in Figure 1. This network supports three types of users, data 
destined to type-1 users going through link 1 only, data to type-2 users going 
through link 2 only, while data for type-3 users goes through the two links. 
Thus, when the two links have unit capacities, the corresponding set $C$ is
given by $\{\lambda \in \mathbb{R}_+^{\mathcal{R}} : \lambda_i + \lambda_3 \le 1,\ i = 1,2\}$.
This example can be extended to the case where the network consists of an
arbitrary number of links, and user types $r$ are characterized by
collections of links used by data destined to them. Denoting by
$\mathcal{L}$ the collection of links, and by $c_\ell$ the capacity of link
$\ell \in \mathcal{L}$, the corresponding network capacity region then
takes the form

$$C = \Big\{ \lambda \in \mathbb{R}_+^{\mathcal{R}} : \sum_{r \in \mathcal{R}} A_{\ell r} \lambda_r \le c_\ell,\ \ell \in \mathcal{L} \Big\},$$

where $A_{\ell r}$ equals 1 or 0 according to whether type-$r$ users require
capacity at link $\ell$ or not. Such types of capacity constraints have been
considered for
instance in [13, 14], as suitable models of wired networks with fixed routing 
such as the Internet, the matrix A then reflecting the route that data of 
users of given type follows through the network. More general polyhedral 
capacity sets C arise when users of a given type r can send data along several 
distinct routes through the network. Yet more general, nonpolyhedral (albeit 
still convex) capacity sets can adequately model the impact of interferences 
between data transmissions of distinct types in wireless networks; see [4] for 
such examples. 
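
As a minimal sketch of the polyhedral case, the membership test for the capacity region can be written as follows. The function name `in_capacity_region` and the constants `ROUTING` and `CAPACITY` are illustrative assumptions, not notation from the text; the matrix encodes the two-link network of Figure 1 with unit capacities.

```python
from typing import Sequence


def in_capacity_region(lam: Sequence[float],
                       routing: Sequence[Sequence[int]],
                       capacity: Sequence[float],
                       tol: float = 1e-12) -> bool:
    """Return True if the allocation vector lam lies in the polyhedral set C.

    routing[l][r] is 1 when type-r users need capacity at link l, else 0.
    """
    if any(rate < 0 for rate in lam):
        return False  # allocations must be nonnegative
    for row, cap in zip(routing, capacity):
        load = sum(a * rate for a, rate in zip(row, lam))
        if load > cap + tol:
            return False  # this link's capacity constraint is violated
    return True


# Two-link network of Figure 1: rows are links, columns are user types 1, 2, 3.
ROUTING = [[1, 0, 1],
           [0, 1, 1]]
CAPACITY = [1.0, 1.0]
```

For instance, the allocation (0.5, 0.5, 0.5) saturates both links and is feasible, whereas (0.8, 0.1, 0.3) overloads link 1.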

In the present work we only require the set $C$ to be convex and
nonincreasing, that is to say, for any two vectors
$\lambda, \lambda' \in \mathbb{R}_+^{\mathcal{R}}$ such that
$\lambda_r \le \lambda'_r$, $r \in \mathcal{R}$, $\lambda$ belongs to $C$
whenever $\lambda'$ does. These two assumptions are met in all the examples
mentioned above.

Mo and Walrand [19] introduced the following criterion for determining the
allocation vector $\lambda$. Given weights $w_r > 0$ and a parameter
$\alpha > 0$, the so-called $(w,\alpha)$-fair allocation vector is the
solution to the optimization problem

(1)  $\max_{\lambda \in C} \sum_{r \in \mathcal{R}} x_r U_r(\lambda_r / x_r),$

where

(2)  $U_r(y) = \begin{cases} w_r \dfrac{y^{1-\alpha}}{1-\alpha} & \text{if } \alpha \ne 1, \\ w_r \log(y) & \text{if } \alpha = 1. \end{cases}$

This parametric family of allocation criteria contains the so-called
proportional fairness criterion, introduced by Kelly [13], which
corresponds to the special case $\alpha = 1$ and $w_r = 1$. In the limit
$\alpha \to \infty$, the $(w,\alpha)$-fair allocation coincides with the
so-called max-min fair allocation (see [2] for a definition).
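
To make the criterion concrete, the sketch below evaluates the utility (2) and the proportionally fair allocation on the two-link network of Figure 1 with unit capacities. The closed form used in `pf_two_link` is not stated in the text; it is an assumption obtained by noting that both links are saturated at the optimum, so that $\lambda_1 = \lambda_2 = 1 - \lambda_3$ and the problem reduces to a one-dimensional maximization. All function names are illustrative.

```python
import math


def utility(y, w=1.0, alpha=1.0):
    """Mo-Walrand utility U_r(y) for weight w and fairness parameter alpha."""
    if alpha == 1.0:
        return w * math.log(y)
    return w * y ** (1.0 - alpha) / (1.0 - alpha)


def objective(x, lam, alpha=1.0):
    """Objective (1) with unit weights; empty classes contribute nothing."""
    return sum(xr * utility(rate / xr, alpha=alpha)
               for xr, rate in zip(x, lam) if xr > 0)


def pf_two_link(x):
    """Proportionally fair allocation (alpha = 1, unit weights) on the
    two-link network of Figure 1 with unit capacities.

    Maximising (x1 + x2) log(1 - lam3) + x3 log(lam3) gives
    lam3 = x3 / (x1 + x2 + x3).
    """
    x1, x2, x3 = x
    n = x1 + x2 + x3
    if n == 0:
        return (0.0, 0.0, 0.0)
    lam3 = x3 / n
    return (1.0 - lam3 if x1 > 0 else 0.0,
            1.0 - lam3 if x2 > 0 else 0.0,
            lam3)
```

With one user of each type, this gives the allocation (2/3, 2/3, 1/3): the type that uses both links receives less than the single-link types.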

The rationale for proportional fairness, as explained in [13], lies in the
following desirable decomposition property. Assume that the ultimate goal
of bandwidth allocation is to maximize the sum of utility functions, $U_r$,
of the rates $\lambda_r / x_r$ allocated to users of class $r$, exactly as
in equation (1), these utility functions being known to the users but not
to the network. Then the decomposition result of [13] states that this can
be done by letting, on the one hand, the network allocate bandwidth
according to proportional fairness, with weights $w_r$ specified by the
network users, and on the other hand, the network users select these
weights $w_r$ appropriately, given their (privately known) utility
functions $U_r$ and the network allocation in response to distinct weights
$w_r$.

Alternatively, the unweighted proportional fairness allocation arises natu- 
rally from results in bargaining theory, in contrast to the above justifications 
based on microeconomic theory. Indeed, the results of Stefanescu and Ste- 
fanescu [27] (see also [17] for further discussion) imply that it is the only 
allocation of bandwidth that satisfies four natural axioms introduced by 
Nash [20] (namely, invariance with respect to affine utility transformations, 
Pareto optimality, independence of irrelevant alternatives, and symmetry), 
assuming that users' utility is a linear function of the rate they receive. (Note 
the difference with the previous microeconomic framework, which allowed 
arbitrary concave utility functions.) It is in fact the natural extension of 
Nash's bargaining solution, originally derived in the special context of two 
users, to an arbitrary number of users. 

The rationales for candidate NBA solutions we have just reviewed orig- 
inate from microeconomic theory of utility, and game (bargaining) theory, 
and assume a static set of network users. There is another line of approach 
to the NBA problem, which is essentially motivated by performance issues 
in a dynamic setting. 

Specifically, assume that network users arrive and leave the system, the
arrivals of type $r$ users being at the instants of a Poisson process of
rate $\nu_r$. Assume further that users remain in the system until they
have transferred a file of a given size, file sizes associated with type
$r$ users being exponentially distributed with parameter $\mu_r$. The state
variable $x = \{x_r\}_{r \in \mathcal{R}}$ is then a Markov process, with
nonzero transition rates

(3)  $x_r \to x_r + 1$ with rate $\nu_r$,
     $x_r \to x_r - 1$ with rate $\mu_r \lambda_r$.
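
The transition structure (3) can be sketched as follows. The helper names `transition_rates` and `single_link_pf` are hypothetical. The allocation rule is passed in as a callback, and the single-link rule used for illustration (capacity 1 shared in proportion to the numbers of users, which is the proportionally fair allocation on one link) is an assumption chosen for brevity.

```python
def single_link_pf(x):
    """Proportionally fair allocation on one link of unit capacity."""
    n = sum(x)
    return [xr / n if n else 0.0 for xr in x]


def transition_rates(x, nu, mu, allocation):
    """Nonzero transition rates (3) out of state x.

    Returns a dict mapping each reachable state to its transition rate.
    `allocation` is any function mapping a state x to the vector lambda.
    """
    lam = allocation(x)
    rates = {}
    for r in range(len(x)):
        # Arrival of a type-r user at rate nu_r.
        up = tuple(x[s] + (s == r) for s in range(len(x)))
        rates[up] = nu[r]
        # Departure of a type-r user at rate mu_r * lambda_r.
        if x[r] > 0 and lam[r] > 0:
            down = tuple(x[s] - (s == r) for s in range(len(x)))
            rates[down] = mu[r] * lam[r]
    return rates
```

From state (1, 2) with $\nu = (0.2, 0.3)$ and $\mu = (1, 2)$, the departure rates are $1 \cdot 1/3$ for type 1 and $2 \cdot 2/3$ for type 2.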

A suitable rationale for selecting an NBA is to guarantee desirable
properties of the above Markov process. One such property is stability (or
equivalently, ergodicity), as it in turn implies that sojourn times of
users are almost surely finite. Ergodicity cannot be guaranteed for all
sets of traffic parameters $\nu_r$, $\mu_r$ and network capacity sets $C$.
In particular, letting $\rho_r := \nu_r / \mu_r$ denote the load brought by
type $r$ users, when the vector $\rho = \{\rho_r\}_{r \in \mathcal{R}}$
does not belong to the capacity set $C$, the process cannot be ergodic (for
a proof, see, e.g., [4]). When $\rho$ is on the boundary of $C$, Kelly and
Williams [15] have established that the process cannot be positive
recurrent, for sets $C$ corresponding to wired networks with fixed routes.
Their proof extends to the case of general convex nonincreasing capacity
sets $C$ with minor modifications.

A reasonable performance requirement is thus that, provided the traffic
intensity vector $\rho$ lies in the interior $\mathring{C}$ of $C$, the
above Markov process is ergodic. Such a property is in fact satisfied for
all $(w,\alpha)$-fair bandwidth allocation criteria, as follows from the
Lyapunov function-based stability proof of Bonald and Massoulié [3] (see
also de Veciana, Lee and Konstantopoulos [10], who first established the
result for the case of max-min fairness, and Ye [29] and Key and Massoulié
[16] for an extension to more general utility functions $U_r$ in the
allocation definition (1)).

Thus, the requirement of achieving ergodicity for the largest possible set
of traffic intensity vectors $\rho$, being met by all $(w,\alpha)$-fair
NBAs, does not distinguish one such criterion as superior to the others.

A more stringent requirement has been suggested by Bonald and Proutiere 
[5], namely that not only the stability region (defined to be the set of vec- 
tors p such that the system is ergodic) be maximal, but also that the cor- 
responding Markov process be insensitive to the distribution of sizes of the 
files transferred by each class of users. Roughly speaking, insensitivity means 
that the stationary distribution of the numbers of users in the system is un- 
affected if the service time distributions are modified, provided their mean 
is left unchanged. For characterizations of insensitive systems, we refer the 
reader to Schassberger [25] and references therein. In particular, it holds 
that, when service rate to users of one type is shared equally among such 
users, that is to say, under a processor sharing assumption, reversibility of 
the original Markov process ensures it is insensitive [5, 25]. 

If insensitivity holds, the system remains ergodic under the natural
stability condition for arbitrary phase-type (i.e., mixtures of
convolutions of exponential distributions; see [1], page 80), not
necessarily exponential, service time distributions. Note that ergodicity
under the natural stability conditions, and for general, nonexponential
service time distributions, has so far been established for the max-min
fairness NBA in a recent article of Bramson [6], but a similar result has
been missing for all other $(w,\alpha)$ NBAs. Although restricted to
max-min fairness, the results of Bramson apply under very weak
integrability assumptions on the service time distributions, and are not
restricted to phase-type distributions.

Bonald and Proutiere have identified a new NBA, the so-called balanced 
fairness allocation, which meets the two requirements of maximal stabil- 
ity region and insensitivity, and moreover maximizes the fraction of time 
during which the system is empty, among all allocations meeting these two 
requirements. They have furthermore identified special network topologies 
for which balanced fairness coincides with proportional fairness, and have 
shown that for all other network topologies, balanced fairness is distinct 
from any utility maximization NBA. 

This leaves several questions open regarding the choice of an NBA. On 
the one hand, utility maximization allocations, such as $(w,\alpha)$-fairness or
more specifically proportional fairness, can be implemented in a distributed 
manner (see, e.g., the seminal paper by Kelly, Maulloo and Tan [14]), and 
are motivated by microeconomic theory and game theory arguments in a 
static setting. In addition, they satisfy the criterion of maximal stability 
region in the dynamic setting, but do not seem to meet the more stringent 
requirement of insensitivity. On the other hand, balanced fairness does meet 
the latter requirement, but no simple distributed technique for realizing this 
NBA is known, if we except the special network topologies, identified in [5], 
where it coincides with proportional fairness. 

In the present work, we provide a novel characterization of proportional 
fairness, and use it to improve upon this unsatisfactory state of affairs.
Indeed, relying on this structural property, we show that the seemingly
fortuitous coincidence of balanced fairness and proportional fairness on
specific network topologies in fact reflects a deeper relationship between
the two NBAs, that holds for any network topology as captured by the set $C$.
More precisely, we exhibit a third NBA, namely modified proportional fair- 
ness, which coincides in some asymptotic sense with proportional fairness. 
Under modified proportional fairness, the system is reversible, and hence 
insensitive. Furthermore, the steady state distributions under modified pro- 
portional fairness and balanced fairness admit the same large deviations 
characteristics, described by a simple explicit rate function. 

As a by-product, we give a new proof of ergodicity of proportional fair- 
ness, which extends to a more general model of network dynamics including 
Markovian users routing. This in turn implies that the usual stability condi- 
tions still hold with service time distributions that are of phase type rather 
than exponential. 

In view of these results, proportional fairness is an attractive candidate
as a default NBA. Indeed, it is motivated by the following factors: (1) the
decomposition property of [13], (2) axiomatic arguments from bargaining
theory [27], (3) its role as an implementable approximation to balanced
fairness, meeting the additional criteria of performance and insensitivity.

The structure of the paper is as follows. Section 2 gives the novel charac- 
terization of proportional fairness. Stability properties with Markovian user 
routing are proven in Section 3. The special case of phase type service dis- 
tributions is discussed in Section 4. Section 5 establishes the relationships 
between balanced fairness and modified proportional fairness, and in partic- 
ular the fact that the corresponding equilibrium distributions have the same 
large deviations characteristics. 

2. Characterization of proportional fairness via convex duality. It is
convenient to consider the logarithms $\gamma_r = \log(\lambda_r)$ of the
allocated capacities, rather than the $\lambda_r$ themselves. Denote by $K$
the subset of $\mathbb{R}^{\mathcal{R}}$ in which these must lie, that is,

$$\gamma = \{\gamma_r\} \in K \iff \lambda = \{\exp(\gamma_r)\} \in C.$$

Given $\gamma$, $\gamma'$ in $K$, and $\epsilon \in (0,1)$, by convexity of
the exponential function, for all $r \in \mathcal{R}$, one has

$$\exp(\epsilon \gamma_r + (1-\epsilon)\gamma'_r) \le \epsilon \exp(\gamma_r) + (1-\epsilon) \exp(\gamma'_r),$$

and thus, since $C$ is convex and nonincreasing, so is $K$. Denote by
$\gamma^{PF}(x)$ the vector of logarithms of proportionally fair
allocations, which necessarily belongs to $K$.

Denote by $\delta_K$ the function that equals zero on $K$, and $+\infty$
outside of $K$. The original characterization of $\lambda^{PF}(x)$ as a
maximizer of $\sum_{r \in \mathcal{R}} x_r \log(\lambda_r)$ over
$\lambda \in C$ readily implies that

(4)  $\gamma^{PF}(x) \in \operatorname{argsup}_{\gamma} \big( \langle \gamma, x \rangle - \delta_K(\gamma) \big).$

Let now $\delta_K^*$ denote the Fenchel-Legendre convex conjugate function
of $\delta_K$, that is,

$$\delta_K^*(x) = \sup_{\gamma} \big( \langle \gamma, x \rangle - \delta_K(\gamma) \big).$$

Recall that the subgradient of a convex function $J$ defined on
$\mathbb{R}^n$ at a point $x \in \mathbb{R}^n$, which is denoted by
$\partial J(x)$, is the set consisting of all the vectors $h$ such that,
for all $y \in \mathbb{R}^n$,

$$J(x) + \langle h, y - x \rangle \le J(y).$$

We then have the following compact characterization of the function
$\gamma^{PF}$:

Lemma 1. The function $\gamma^{PF}$ satisfies

(5)  $\gamma^{PF}(x) \in \partial \delta_K^*(x),$

where $\partial \delta_K^*(x)$ denotes the subgradient of the convex
function $\delta_K^*$ at $x$.

Proof. It follows from Theorem 23.5, page 218 in Rockafellar [22] that
conditions (4) and (5) are equivalent for any proper convex function
$\delta_K$. Recall that a convex function is proper if it nowhere takes the
value $-\infty$, and it takes finite values at some points. Both conditions
hold for $\delta_K$, which establishes the lemma. □

This simple result allows us to use the powerful theory of convex duality
in the study of the function $x \mapsto \gamma^{PF}(x)$. For instance, we
have the following:

Lemma 2. The function $\delta_K^*$ is continuously differentiable on
$(0,\infty)^{\mathcal{R}}$, and thus on $(0,\infty)^{\mathcal{R}}$,
$\gamma^{PF}(x)$ coincides with the ordinary gradient of $\delta_K^*$ at
$x$, and depends continuously on $x$.

Proof. By Theorem 25.1, page 242 in [22], at a point $x$ where a convex
function admits a unique subgradient, it is differentiable, and its
subgradient reduces to its ordinary gradient. The original allocation
vector $\lambda^{PF}(x)$ is uniquely defined at $x$ whenever $x_r > 0$ for
all $r \in \mathcal{R}$, by strict concavity of the log function. Thus,
$\gamma^{PF}(x)$ is also uniquely defined at
$x \in (0,\infty)^{\mathcal{R}}$, and hence it coincides with the ordinary
gradient of $\delta_K^*$ at $x$.

Furthermore, by Theorem 25.5, page 246 in [22], the gradient of a proper
convex function is continuous on the domain where the function is
differentiable. The claimed continuity of the allocation vector
$\gamma^{PF}(x)$ on $x \in (0,\infty)^{\mathcal{R}}$ follows. □
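
The gradient characterization can be checked numerically on the two-link network of Figure 1 with unit capacities. The closed forms below are assumptions derived for this specific example (they are not given in the text): the proportionally fair allocation is $\lambda_1 = \lambda_2 = (x_1+x_2)/n$, $\lambda_3 = x_3/n$ with $n = x_1+x_2+x_3$, so that $\delta_K^*(x) = a \log a + x_3 \log x_3 - n \log n$ with $a = x_1 + x_2$. Function names are illustrative.

```python
import math


def xlogx(v):
    """v * log(v), with the usual convention 0 * log(0) = 0."""
    return v * math.log(v) if v > 0 else 0.0


def conjugate(x):
    """delta_K^*(x) = sup over gamma in K of <gamma, x>, two-link example."""
    x1, x2, x3 = x
    return xlogx(x1 + x2) + xlogx(x3) - xlogx(x1 + x2 + x3)


def log_pf(x):
    """gamma^PF(x): logarithms of the proportionally fair allocation."""
    x1, x2, x3 = x
    n = x1 + x2 + x3
    return (math.log((x1 + x2) / n), math.log((x1 + x2) / n), math.log(x3 / n))


def numerical_gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for r in range(len(x)):
        up, down = list(x), list(x)
        up[r] += h
        down[r] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad
```

At a point with strictly positive coordinates, `numerical_gradient(conjugate, x)` agrees with `log_pf(x)` up to discretization error, as Lemma 2 predicts; `conjugate` is also positively homogeneous of degree 1.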

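Introduce now the alternative NBA, denoted PF' for modified proportional
fairness, and defined by

$$\lambda_r^{PF'}(x) = \begin{cases} \exp\big(\delta_K^*(x) - \delta_K^*(x - e_r)\big) & \text{if } x_r > 0, \\ 0 & \text{otherwise.} \end{cases}$$

Introduce also the function

(6)  $L(x) = \delta_K^*(x) - \sum_{r \in \mathcal{R}} \log(\rho_r) x_r,$

where $\rho_r = \nu_r / \mu_r$, $r \in \mathcal{R}$. It is readily verified
that, under the PF' allocation strategy, the Markov process is reversible,
and thus insensitive. Indeed, one verifies the detailed balance equations

(7)  $\pi(x)\, \nu_r = \pi(x + e_r)\, \mu_r \lambda_r^{PF'}(x + e_r), \qquad \pi(x) := \exp(-L(x)).$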

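The natural stability condition is, as discussed previously, the following:

(8)  $\rho \in \mathring{C}$, the interior of $C$.

The following lemma gives useful properties satisfied by the function $L$:

Lemma 3. The function $L$ is lower semicontinuous on
$\mathbb{R}^{\mathcal{R}}$, and continuous on $\mathbb{R}_+^{\mathcal{R}}$.
Furthermore, under assumption (8), there exist positive constants
$a, A > 0$ such that for all $x \in \mathbb{R}_+^{\mathcal{R}}$,

(9)  $a \|x\|_\infty \le L(x) \le A \|x\|_\infty,$

where $\|x\|_\infty := \max_{r \in \mathcal{R}} |x_r|$.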

Proof. The function $\delta_K^*$ is lower semicontinuous, as the
Fenchel-Legendre conjugate of a proper convex function (by Theorem 12.2,
page 104 in [22]). The sum of an affine (and hence continuous) function
with a lower semicontinuous function is lower semicontinuous. Thus $L$ is
lower semicontinuous.

Continuity of $L$ on $\mathbb{R}_+^{\mathcal{R}}$ follows from Theorem 2.35,
page 59 in Rockafellar and Wets [23] and the fact that it is convex, lower
semicontinuous, and finite on $\mathbb{R}_+^{\mathcal{R}}$.

Under the stability condition (8), there exists some $\epsilon > 0$ such
that $(1+\epsilon)\rho \in C$. Thus,

$$\delta_K^*(x) \ge \sum_{r \in \mathcal{R}} x_r \log((1+\epsilon)\rho_r).$$

It follows that

$$L(x) \ge \log(1+\epsilon) \sum_{r \in \mathcal{R}} x_r.$$

This provides the first inequality in (9). In order to establish the second
inequality, use the homogeneity property of $L$ to write

$$L(x) = \|x\|_\infty L(\|x\|_\infty^{-1} x) \le \|x\|_\infty \sup_{y \in \mathbb{R}_+^{\mathcal{R}},\, \|y\|_\infty = 1} L(y).$$

The supremum of a continuous function on a compact set is finite, which
yields the second half of (9). □

It follows from equation (9) that, under condition (8), the stationary 
measure (7) can be normalized to a probability measure, which then implies 
stability (ergodicity) of the Markov process under the modified proportional 
fairness NBA, when (8) holds. 
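
The reversibility claim can be checked numerically on the two-link network of Figure 1 with unit capacities. This is a sketch under stated assumptions: PF' is taken to be $\lambda_r^{PF'}(x) = \exp(\delta_K^*(x) - \delta_K^*(x - e_r))$ for $x_r > 0$, the candidate stationary measure is $\pi(x) = \exp(-L(x))$ with $L$ as in (6), and the closed form of $\delta_K^*$ is the one derived for this particular network rather than one stated in the text. All function names are illustrative.

```python
import math


def xlogx(v):
    """v * log(v), with the convention 0 * log(0) = 0."""
    return v * math.log(v) if v > 0 else 0.0


def conjugate(x):
    """delta_K^*(x) for the two-link network with unit capacities."""
    x1, x2, x3 = x
    return xlogx(x1 + x2) + xlogx(x3) - xlogx(x1 + x2 + x3)


def lyapunov(x, rho):
    """L(x) = delta_K^*(x) - sum_r log(rho_r) x_r."""
    return conjugate(x) - sum(math.log(p) * xr for p, xr in zip(rho, x))


def modified_pf(x):
    """PF' allocation: exp(delta*(x) - delta*(x - e_r)) if x_r > 0, else 0."""
    lam = []
    for r in range(len(x)):
        if x[r] > 0:
            prev = list(x)
            prev[r] -= 1
            lam.append(math.exp(conjugate(x) - conjugate(prev)))
        else:
            lam.append(0.0)
    return lam


def detailed_balance_sides(x, r, nu, mu):
    """Both sides of pi(x) nu_r = pi(x + e_r) mu_r lambda_r^{PF'}(x + e_r)."""
    rho = [n / m for n, m in zip(nu, mu)]
    up = list(x)
    up[r] += 1
    lhs = math.exp(-lyapunov(x, rho)) * nu[r]
    rhs = math.exp(-lyapunov(up, rho)) * mu[r] * modified_pf(up)[r]
    return lhs, rhs
```

The two sides agree up to rounding for every state and type tried, and the PF' allocation stays inside the capacity set, consistent with the property established in Lemma 4.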

Fix now $y \in \mathbb{R}_+^{\mathcal{R}}$, and let $x = ny$, where $n$ is
large. The heuristic calculation

$$\lambda_r^{PF'}(x) = \exp\big(n\delta_K^*(y) - n\delta_K^*(y - n^{-1}e_r)\big) \approx \exp\big(\partial_r \delta_K^*(y)\big),$$

based on the homogeneity property of $\delta_K^*$, according to which
$\delta_K^*(ny) = n\delta_K^*(y)$, and a heuristic Taylor approximation,
suggests that the behaviors of the system under PF and PF' are similar, at
least far from the origin. At this stage we content ourselves with making
the following conjecture:

Conjecture 1. Let $X^{PF}$ and $X^{PF'}$ denote the number of customers in
steady state under PF and PF', respectively. We conjecture that the
rescaled vectors $n^{-1}X^{PF}$ and $n^{-1}X^{PF'}$ satisfy, as
$n \to \infty$, a large deviations principle with the same rate function
$L$ as defined in (6).
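
The asymptotic agreement between PF and PF' suggested by the heuristic calculation can be observed numerically. The sketch below again uses the two-link network of Figure 1 with unit capacities; the closed forms for $\delta_K^*$ and for the proportionally fair allocation are assumptions derived for that example, and PF' is assumed to be $\exp(\delta_K^*(x) - \delta_K^*(x - e_r))$. It illustrates the heuristic only and says nothing about the conjecture itself.

```python
import math


def xlogx(v):
    """v * log(v), with the convention 0 * log(0) = 0."""
    return v * math.log(v) if v > 0 else 0.0


def conjugate(x):
    """delta_K^*(x) for the two-link network with unit capacities."""
    x1, x2, x3 = x
    return xlogx(x1 + x2) + xlogx(x3) - xlogx(x1 + x2 + x3)


def pf(y):
    """Proportionally fair allocation; depends only on the direction of y."""
    y1, y2, y3 = y
    n = y1 + y2 + y3
    return ((y1 + y2) / n, (y1 + y2) / n, y3 / n)


def modified_pf(x):
    """PF' allocation at a state x with strictly positive coordinates."""
    lam = []
    for r in range(len(x)):
        prev = list(x)
        prev[r] -= 1
        lam.append(math.exp(conjugate(x) - conjugate(prev)))
    return lam


def max_gap(y, n):
    """Largest coordinate-wise gap between PF'(n y) and PF(y)."""
    x = [n * yr for yr in y]
    return max(abs(a - b) for a, b in zip(modified_pf(x), pf(y)))
```

For $y = (1,1,1)$ the gap shrinks roughly like $1/n$ as $n$ grows.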

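Remark that the vector of allocations $\lambda^{PF'}(x)$ belongs to the
convex set $C$ for all $x \in \mathbb{Z}_+^{\mathcal{R}}$, in view of the
following property of the function $\delta_K^*$:

Lemma 4. The function $\delta_K^*$ is such that, for all
$x \in \mathbb{R}_+^{\mathcal{R}}$, and all $\epsilon_r > 0$,
$r \in \mathcal{R}$,

(10)  $\left\{ \dfrac{\delta_K^*(x) - \delta_K^*(x - \epsilon_r e_r)}{\epsilon_r} \right\}_{r \in \mathcal{R}} \in K.$

It is understood in this expression that a vector $u$ with coordinates in
$\{-\infty\} \cup \mathbb{R}$ belongs to $K$ when the vector $e^u$ with
coordinates $e^{u_r}$ belongs to the original convex set $C$, and
$e^{-\infty} = 0$.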

Proof. Let $x \in \mathbb{R}_+^{\mathcal{R}}$. Assume first that all the
coordinates $x_r$ are strictly positive. It follows that the vector $u$
achieving the supremum in the original definition of $\delta_K^*(x)$ is
uniquely defined. By Lemma 1 above, and Theorem 25.1, page 242 in [22], it
follows that $\delta_K^*$ is differentiable at $x$, its (ordinary) gradient
being the vector $u$ achieving that supremum. Also, the function
$\epsilon \mapsto \epsilon^{-1}[\delta_K^*(x) - \delta_K^*(x - \epsilon e_r)]$
is nonincreasing in $\epsilon > 0$, and achieves its maximum as
$\epsilon \searrow 0$, where it equals the coordinate $u_r$ of the gradient
(see Theorem 23.1, pages 213-214 in [22]). By monotonicity of the set $K$,
it follows that $\delta_K^*$ satisfies the condition (10) at $x$.

We now show that the same is true when some coordinates of $x$ equal zero.
Let $I \subset \mathcal{R}$ denote the set of indices $r$ for which the
coordinate $x_r$ equals zero. We say that $x$ belongs to the face $I$ when
$x \in \mathbb{R}_+^{\mathcal{R}}$ and $x_r = 0$ if and only if $r \in I$.
We also denote by $K_I$ the subset of $K$ consisting of those vectors $u$
with $u_r = -\infty$ if and only if $r \in I$. In the definition of
$\delta_K^*(x)$, we may actually replace the optimization domain $K$ by
$K_I$. There is then a single vector $u$ of $K_I$ which achieves the
corresponding supremum. We may conclude as in the previous case that (10)
holds in the present case as well. □

3. Stability properties of proportional fairness. The above characteriza- 
tion is now applied to the study of stability properties of the Markov process 
describing the number of users in the system under proportional fairness. 
Ergodicity is established by following the general approach of fluid limits, 
introduced in the contexts of more traditional queueing systems by Rybko 
and Stolyar [24] and Dai [9]. 

The section is organized as follows. The general model with Markovian 
users routing is first introduced. A characterization of the fluid limits of 
this process is then given. It is next established that the function L defined 
in (6) is a Lyapunov function for these fluid limits, from which stability (or 
equivalently, ergodicity) of the original Markov process is deduced, under 
condition (8) for suitably defined loads $\rho_r$, $r \in \mathcal{R}$.

The model with Markovian users routing is as follows. As before, users are
of different types, $r \in \mathcal{R}$. External arrivals of type $r$
users occur according to a Poisson process with intensity $\nu_r$; the
service times of type-$r$ users are again exponential with parameter
$\mu_r$. However, after completing service, type $r$ users will re-enter
the system as type $s$ users with some probability $p_{rs}$. Thus the
nonzero transition rates are now given by

(11)  $x \to x + e_r$ with rate $\nu_r$,
      $x \to x - e_r + e_s$ with rate $\mu_r \lambda_r^{PF}(x)\, p_{rs}$,
      $x \to x - e_r$ with rate $\mu_r \lambda_r^{PF}(x) \big(1 - \sum_{s \in \mathcal{R}} p_{rs}\big)$.

In the above, $e_r$ denotes the $r$th unit vector in
$\mathbb{R}^{\mathcal{R}}$. It is assumed that the matrix
$P = (p_{rs})_{r,s \in \mathcal{R}}$ is substochastic, and that its
spectral radius is strictly less than 1. Thus, there exists a unique vector
$\bar\nu = (\bar\nu_r)_{r \in \mathcal{R}}$ solving the traffic equations

$$\bar\nu_r = \nu_r + \sum_{s \in \mathcal{R}} \bar\nu_s p_{sr}, \quad r \in \mathcal{R},$$

also written in matrix form

$$(I - P^T)\bar\nu = \nu,$$

where $P^T$ is the transpose of the routing probability matrix $P$.
Introduce the notation $\rho_r = \bar\nu_r / \mu_r$, and
$\rho = (\rho_r)_{r \in \mathcal{R}}$. The main result of this section is
the following:

Theorem 1. The Markov process with Markovian users routing is ergodic under
condition (8).
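
The effective arrival rates entering condition (8) solve the traffic equations $(I - P^T)\bar\nu = \nu$. A minimal sketch of one way to compute them is fixed-point iteration, which converges because the spectral radius of $P$ is strictly less than 1. The function name `solve_traffic_equations` and its parameters are illustrative assumptions; a linear solver would serve equally well.

```python
def solve_traffic_equations(nu, P, iterations=10_000, tol=1e-14):
    """Solve nu_bar = nu + P^T nu_bar by fixed-point iteration.

    nu: external arrival rates; P[s][r]: probability that a type-s user
    re-enters the system as a type-r user after completing service.
    """
    n = len(nu)
    nu_bar = list(nu)
    for _ in range(iterations):
        new = [nu[r] + sum(nu_bar[s] * P[s][r] for s in range(n))
               for r in range(n)]
        if max(abs(a - b) for a, b in zip(new, nu_bar)) < tol:
            return new
        nu_bar = new
    return nu_bar
```

For two types with unit external rates that each re-enter as the other type with probability 1/2, the effective rates are both equal to 2; the loads are then $\rho_r = \bar\nu_r / \mu_r$.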

In order to establish the theorem, a characterization of the fluid limits of 
the original Markov process is required. To this end, the following definition 
will be used. Note that the constant A appearing in this definition differs 
from the one appearing in Lemma 3. In the sequel, to simplify notations, A 
will always be used to denote an arbitrary finite constant, whose value may 
vary from one statement to another. 

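Definition 1. The functions $x_r : \mathbb{R}_+ \to \mathbb{R}_+$,
$r \in \mathcal{R}$, are called fluid trajectories of the system with
Markovian users routing if there exist nondecreasing, Lipschitz continuous
functions $D_r : \mathbb{R}_+ \to \mathbb{R}_+$, $r \in \mathcal{R}$, such
that $D_r(0) = 0$, admitting $A$ as a Lipschitz constant for any $A$ such
that $C \subset [0,A]^{\mathcal{R}}$, that verify

(12)  $x_r(t) = x_r(0) + \nu_r t - \mu_r D_r(t) + \sum_{s \in \mathcal{R}} p_{sr} \mu_s D_s(t), \quad t \in \mathbb{R}_+,\ r \in \mathcal{R},$

and for almost every $t \in \mathbb{R}_+$, all $r \in \mathcal{R}$, the
derivatives $\dot D_r(t)$ exist and verify

(13)  $\dot D_r(t) \in \big[0, \limsup_{y \to x(t)} \lambda_r^{PF}(y)\big],$

(14)  $x_r(t) > 0 \Rightarrow \dot D_r(t) = \lambda_r^{PF}(x(t)) = \exp(\gamma_r^{PF}(x(t))).$

The following notation will be used in the sequel. For any
$x \in \mathbb{R}_+^{\mathcal{R}}$, $S(x)$ denotes the set of all fluid
trajectories of the system with initial condition $x$. Thus it is a subset
of $C([0,+\infty), \mathbb{R}_+^{\mathcal{R}})$, that is the space of
continuous, $\mathbb{R}_+^{\mathcal{R}}$-valued functions on $[0,+\infty)$.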

Note that at this stage neither existence nor uniqueness of fluid trajecto- 
ries with a given initial condition have been established. 

The following result is the first step of the proof of Theorem 1. It
implies as a corollary that the set $S(x)$ is nonempty, for any
$x \in \mathbb{R}_+^{\mathcal{R}}$. However no claim of uniqueness of fluid
trajectories is made.

Theorem 2. Consider a sequence of initial conditions
$X^k(0) = (X_r^k(0))_{r \in \mathcal{R}}$, $k \ge 1$, such that for a
sequence of positive numbers $(z_k)_{k \in \mathbb{N}}$,
$\lim_{k \to \infty} z_k = +\infty$, and the limit
$\lim_{k \to \infty} z_k^{-1} X^k(0) = x(0)$ exists in
$\mathbb{R}_+^{\mathcal{R}}$.

Then for all $T > 0$, and all $\epsilon > 0$, the following convergence
takes place:

$$\lim_{k \to \infty} \mathbf{P}\Big( \inf_{f \in S(x(0))} \sup_{t \in [0,T]} \big| z_k^{-1} X^k(z_k t) - f(t) \big| \ge \epsilon \Big) = 0.$$

In words, the restriction of the rescaled process $z_k^{-1} X^k(z_k \cdot)$
to any compact interval $[0,T]$ converges in probability to the set
$S(x(0))$ of fluid trajectories with initial condition $x(0)$, where
convergence of processes is for the uniform norm.



The proof of Theorem 2 is deferred to Appendix A. We expect a similar
result to hold for other NBAs, in particular for $\alpha$-fair NBAs,
provided one replaces $\lambda^{PF}$ by the corresponding allocation vector
$\lambda^{w,\alpha}$ in the definition of the fluid trajectories. Indeed
the proof given in Appendix A relies on two technical lemmas by Ye, Ou and
Yuan [30] which apply to general $\alpha$-fair NBAs, and the rest of the
proof can be adapted in a straightforward manner.

The second step of the proof of Theorem 1 consists in establishing a 
suitable uniform convergence to zero of fluid trajectories: 



Theorem 3. Under the stability condition (8), there exist $\tau > 0$ and
$\epsilon > 0$ such that, for any fluid trajectory
$\{x(t)\}_{t \in \mathbb{R}_+}$, provided $L(x(0)) = 1$, then
$L(x(\tau)) \le 1 - \epsilon$.
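
The Lyapunov property can be illustrated on a simple case. The sketch below integrates, with an explicit Euler scheme, the fluid dynamics $\dot x_r = \nu_r - \mu_r \lambda_r^{PF}(x)$ for the two-link network of Figure 1 with unit capacities and no routing ($P = 0$, so $\bar\nu = \nu$), and records $L(x(t))$ along the path. The closed forms for the allocation and for $\delta_K^*$ are assumptions derived for this example; the step size, horizon and function names are arbitrary illustrative choices, and the clipping at zero is a crude stand-in for the boundary behavior allowed by Definition 1.

```python
import math


def xlogx(v):
    """v * log(v), with the convention 0 * log(0) = 0."""
    return v * math.log(v) if v > 0 else 0.0


def conjugate(x):
    """delta_K^*(x) for the two-link network with unit capacities."""
    x1, x2, x3 = x
    return xlogx(x1 + x2) + xlogx(x3) - xlogx(x1 + x2 + x3)


def pf_two_link(x):
    """Proportionally fair allocation for a state with positive coordinates."""
    x1, x2, x3 = x
    n = x1 + x2 + x3
    return ((x1 + x2) / n, (x1 + x2) / n, x3 / n)


def lyapunov(x, rho):
    """L(x) = delta_K^*(x) - sum_r log(rho_r) x_r."""
    return conjugate(x) - sum(math.log(p) * xr for p, xr in zip(rho, x))


def fluid_path(x0, nu, mu, h=0.01, steps=100):
    """Euler integration of the fluid dynamics; returns the final state and
    the list of Lyapunov values along the path."""
    x = tuple(float(v) for v in x0)
    rho = [n / m for n, m in zip(nu, mu)]
    values = [lyapunov(x, rho)]
    for _ in range(steps):
        lam = pf_two_link(x)
        x = tuple(max(0.0, xr + h * (n - m * rate))
                  for xr, n, m, rate in zip(x, nu, mu, lam))
        values.append(lyapunov(x, rho))
    return x, values
```

With loads $\rho = (0.3, 0.3, 0.3)$, which lie in the interior of $C$, the recorded values of $L$ decrease at every step from the initial state $(1,1,1)$.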



The following lemma will be needed in the proof of Theorem 3: 



Lemma 5. Let $\{x(t)\}_{t \in \mathbb{R}_+}$ be a fluid trajectory as per
Definition 1. For every $t > 0$, let $I(t)$ denote the set of indices
$r \in \mathcal{R}$ such that $x_r(t) = 0$, and
$\bar I(t) = \mathcal{R} \setminus I(t)$.

(i) There exist modified arrival rates $\tilde\nu_r$, $r \in \bar I(t)$,
and modified routing probabilities $\tilde p_{rs}$, $r, s \in \bar I(t)$,
that depend only on the set $I(t)$, such that the matrix
$\tilde P = (\tilde p_{rs})_{r,s \in \bar I(t)}$ is substochastic with
spectral radius strictly less than 1, the identity

(15)  $(\bar\nu_r)_{r \in \bar I(t)} = (I - \tilde P^T)^{-1} \tilde\nu$

holds, and furthermore, for almost every $t > 0$,

(16)  $\dfrac{d}{dt} x_r(t) = \tilde\nu_r + \sum_{s \in \bar I(t)} \mu_s \tilde p_{sr} \lambda_s^{PF}(x(t)) - \mu_r \lambda_r^{PF}(x(t)), \quad r \in \bar I(t),$
      $\dfrac{d}{dt} x_r(t) = 0, \quad r \in I(t).$

Let $f(t) := L(x(t))$.

(ii) For almost every $t > 0$, it holds that

(17)  $\limsup_{h \searrow 0} \dfrac{f(t+h) - f(t)}{h} \le \sum_{r \in \bar I(t)} \big(\gamma_r^{PF}(x(t)) - \log(\rho_r)\big) \dot x_r(t),$

where the derivatives $\dot x_r(t)$ are as in (16).

(iii) There exists a constant $A$ such that, for all $t > 0$,

(18)  $\limsup_{h \searrow 0} \dfrac{f(t+h) - f(t)}{h} \le A.$

The proof of the lemma is given in Appendix B. 
The following auxiliary result will also be used: 

Lemma 6. Let a continuous function $f : [0,T] \to \mathbb{R}$ be given.
Assume that there exists $\epsilon \in \mathbb{R}$ such that, for almost
all $t \in [0,T]$,

(19)  $\limsup_{h \searrow 0} \dfrac{f(t+h) - f(t)}{h} \le -\epsilon.$

Assume further the existence of a constant $A \in \mathbb{R}$ such that for
all $t \in [0,T]$,

(20)  $\limsup_{h \searrow 0} \dfrac{f(t+h) - f(t)}{h} \le A.$

Then it holds that, for all $s, t \in [0,T)$, $s < t$,

(21)  $f(t) - f(s) \le -\epsilon (t - s).$

Remark 1. The following example illustrates the role of assumption (20) in
Lemma 6. Let $f_+(t) = m([0,t])$, where $m$ is the uniform measure on the
Cantor set obtained by successive exclusion of the middle third from the
interval $[0,1]$ (see, e.g., Falconer [11] for background). More precisely,
this measure can be defined by specifying the mass it puts on intervals
$[0,x]$ where $x$ is a triadic number, that is

$$x = \sum_{i \ge 1} z_i 3^{-i},$$

where $z_i \in \{0,1,2\}$, $i \ge 1$. The uniform measure $m$ on this
Cantor set is then specified by



$$m([0,x]) = \sum_{i=1}^{k-1} z_i 2^{-i-1} + 2^{-k},$$

where $k = \min\{i \ge 1 : z_i = 1\}$.
Define then

$$f(t) = -\epsilon t + f_+(t).$$

The function $f$ is continuous, because the measure $m$ has no atoms.
Moreover, the measure $m$ is supported by a set of null Lebesgue measure,
so that the function $f$ satisfies condition (19) of Lemma 6. However, the
conclusion (21) does not hold, precisely because condition (20) is not
satisfied.
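
The counterexample of Remark 1 can be explored numerically. The sketch below evaluates $m([0,x])$ by the standard digit-by-digit computation of the Cantor function (read the ternary digits of $x$, stop at the first digit equal to 1, and map digits 2 to binary digits 1); this is an assumption about how to compute the measure, not necessarily the exact formula of the text. Exact rational arithmetic is used to avoid rounding in the ternary expansion, and the function names and the `depth` parameter are illustrative.

```python
from fractions import Fraction
from math import floor


def cantor_mass(x, depth=60):
    """m([0, x]) for the uniform measure m on the middle-third Cantor set.

    Uses the first `depth` ternary digits of x, so the result is exact for
    triadic numbers with at most `depth` digits and approximate otherwise.
    """
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    if x <= 0:
        return Fraction(0)
    mass, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = floor(x)      # next ternary digit of x
        x -= digit
        if digit == 1:
            return mass + weight  # x lies in a removed middle third
        if digit == 2:
            mass += weight
        weight /= 2
    return mass


def f(t, eps):
    """f(t) = -eps * t + m([0, t]) as in Remark 1."""
    return -eps * Fraction(t) + cantor_mass(t)
```

With $\epsilon = 1/10$, one finds $f(1) - f(0) = 9/10$, which exceeds $-\epsilon(1 - 0)$, so the conclusion (21) indeed fails for this function.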
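
The result of Theorem 3 is established as follows.

Proof of Theorem 3. Let $\{x(t)\}_{t \in \mathbb{R}_+}$ denote a fluid
trajectory. Introduce the notation
$u_r = \log(\lambda_r^{PF}(x(t)) / \rho_r)$. The right-hand side of
equation (17), which we shall denote $h(t)$, can then be rewritten, in view
of (16), as

$$h(t) = \sum_{r \in \bar I(t)} u_r \Big[ \tilde\nu_r + \sum_{s \in \bar I(t)} \tilde p_{sr} \bar\nu_s e^{u_s} - \bar\nu_r e^{u_r} \Big],$$

or equivalently, in matrix form,

$$h(t) = \langle u, \tilde\nu - (I - \tilde P^T)(\bar\nu e^u) \rangle.$$

We use the notation $[\eta]$ to denote the diagonal matrix with diagonal
entries provided by the coordinates of the vector $\eta$. Elementary
manipulations entail that

(22)  $h(t) = -\langle u, (I - \tilde P^T)[\bar\nu](e^u - 1) \rangle$
      $\phantom{h(t)} = -\langle \bar\nu, [(I - \tilde P)u](e^u - 1) \rangle$
      $\phantom{h(t)} = -\langle \tilde\nu, (I - \tilde P)^{-1}[(I - \tilde P)u](e^u - 1) \rangle,$

the first equality relying on identity (15). In order to show that the
previous expression is nonpositive, it is enough to show that for each
$r \in \bar I(t)$, the coefficient of $\tilde\nu_r$ is nonpositive, that is,

(23)  $F_r(u) := \sum_{n \ge 0} \sum_{s \in \bar I(t)} \tilde p_{rs}^{(n)} \Big[ u_s - \sum_{\ell \in \bar I(t)} \tilde p_{s\ell} u_\ell \Big] (e^{u_s} - 1) \ge 0, \quad r \in \bar I(t),$

where $\tilde p_{rs}^{(n)}$ denotes the $(r,s)$ entry of $\tilde P^n$.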


The following lemma, whose proof is deferred to Appendix D, is now needed:

Lemma 7. For any substochastic matrix $P = (p_{rs})_{r,s \in I}$ with
spectral radius strictly less than 1, and any real numbers $u_s$,
$s \in I$:

(i) Inequality (23) holds.

(ii) The function $F_r$ as defined in (23) verifies $F_r(u) \ge F_r(u^+)$
for all $u \in \mathbb{R}^I$, where $u^+ := (u_s^+)_{s \in I}$, and
$u_s^+ = \max(u_s, 0)$.

(iii) There is equality in (23) only if for all states $s$ such that
$\sum_{n \ge 0} p_{rs}^{(n)} > 0$, one has $u_s = 0$.


That the term $h(t)$ is nonpositive follows from Lemma 7(i).

When $x(t) \ne 0$, the allocation vector $\lambda^{PF}(x(t))$ must lie on
the external boundary of the capacity set $C$. Thus, by (8), for some
positive $\epsilon$, there must exist some coordinate $r$ such that
$\lambda_r^{PF}(x(t)) \ge (1+\epsilon)\rho_r$. Therefore, setting
$\delta = \log(1+\epsilon) > 0$, it holds that $u_s \ge \delta$ for some
$s \in \bar I(t)$. There must also exist some $r \in \bar I(t)$ such that
$\tilde\nu_r > 0$, and $\sum_{n \ge 0} \tilde p_{rs}^{(n)} > 0$. It is also
the case that the $u_k$ are bounded from above by some constant $A$, since
the allocations $\lambda^{PF}$ are bounded from above.

By Lemma 7(ii), one thus has

$$x(t) \ne 0 \Rightarrow h(t) \le -\inf_{r : \tilde\nu_r > 0} \tilde\nu_r \inf_{u \in S} F_r(u),$$

where the set $S$ is defined as

Since the function $F_r$ is continuous and the set $S$ is compact, the
infimum of $F_r(u)$ over $S$ is attained; however it cannot be zero, in
view of Lemma 7(iii) and the definition of $S$. Thus, the right-hand side
of the above is less than $-\epsilon(I(t))$ for some strictly positive
$\epsilon(I(t))$ that depends only on the set $I(t)$.

By assumption, the initial condition of the fluid trajectory in the
statement of Theorem 3 is such that $L(x(0)) = 1$.

Thus, in view of (9), there exists $r$ so that $x_r(0) \ge 1/K$ for some
finite positive constant $K$. Setting $\tau = 1/(2KA)$, where $A$ is such
that the capacity set $C$ is a subset of $[0,A]^{\mathcal{R}}$, it then
follows that for any fluid trajectory with initial condition $x(0)$ such
that $L(x(0)) = 1$, then $x(t) \ne 0$ on $[0,\tau]$. Hence, by the previous
evaluations, in view of (17), (18), for any such fluid trajectory, the
function $f(t) = L(x(t))$ satisfies the assumptions of Lemma 6, with
$\epsilon := \inf_{I \subset \mathcal{R},\, I \ne \mathcal{R}} \epsilon(I) > 0$.
Thus, by Lemma 6:

$$L(x(\tau)) \le 1 - \tau\epsilon < 1.$$

The claim of the theorem follows. □

The proof of Theorem 1 will require combining Theorems 2, 3 and the
following ergodicity criterion, which is a direct consequence of Theorem
8.13, page 224 in Robert [21]:

Theorem 4 ([21]). Let $X(t)$ be a Markov jump process on a countable state
space $\mathcal{S}$. Assume there exists a function
$L : \mathcal{S} \to \mathbb{R}_+$ and constants $A$, $\epsilon$, and an
integrable stopping time $\hat\tau > 0$ such that for all
$x \in \mathcal{S}$:

(24)  $L(x) > A \Rightarrow \mathbf{E}_x L(X(\hat\tau)) \le L(x) - \epsilon\, \mathbf{E}_x(\hat\tau).$

If in addition the set $\{x : L(x) \le A\}$ is finite, and
$\mathbf{E}_x L(X(1)) < +\infty$ for all $x \in \mathcal{S}$, then the
process $X(t)$ is ergodic.
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Proof of Theorem 1. Let r be as in Theorem 3. Consider the deter- 
ministic stopping time f = L(x(0))r. Denote by the probability distri- 
bution of the Markov process {X{t)) with initial condition x G N^. 

It is readily seen that the collection of probability distributions

$$\{\mathbf{P}_x(L(x)^{-1} X_r(\hat\tau) \in \cdot)\}_{r \in \mathcal{R},\; x \in \mathbb{N}^{\mathcal{R}} \setminus \{0\}}$$

is uniformly integrable. Indeed, let $A_r$ denote independent unit rate Poisson processes, used to generate user arrival times. Then the process $X(t)$ can be generated so that

(25) $X_r(t) \le X_r(0) + A_r(\nu_r t)$, $t \ge 0$, $r \in \mathcal{R}$.

Thus, for $X(0) = x$, and recalling that $\hat\tau = L(x)\tau$,

$$\frac{X_r(\hat\tau)}{L(x)} \le \frac{x_r}{L(x)} + \frac{A_r(\nu_r \tau L(x))}{L(x)}.$$

The first term is bounded from above uniformly in $x \neq 0$, in view of Lemma 3, (9). The second term has mean $\nu_r\tau$ and variance $\nu_r\tau/L(x)$. Thus the second moments of these variables are uniformly bounded in $x \neq 0$. Therefore, the de la Vallée-Poussin criterion for uniform integrability applies.

In view of (9), it then follows that the collection of probability distributions

$$\{\mathbf{P}_x(L(x)^{-1} L(X(\hat\tau)) \in \cdot)\}$$

is also uniformly integrable.

The result of Theorem 2 entails that for any sequence of initial conditions $x^k$ such that $\|x^k\|_\infty \to \infty$ as $k \to \infty$, the corresponding rescaled variables $X^k(L(x^k)\tau)/L(x^k)$ converge in probability to the set $V$ defined as

$$V := \bigcup_{x \in \mathbb{R}_+^{\mathcal{R}},\, L(x) = 1} \{x(\tau),\; x(\cdot) \in \mathcal{S}(x)\}.$$

In words, $V$ is the set of states of fluid trajectories at time $\tau$ for all fluid trajectories with initial condition $x(0)$ satisfying $L(x(0)) = 1$.

It can be verified from (9) and the definition of fluid trajectories that the set $V$ is compact. Continuity of $L$ together with compactness of $V$ entail that the sequence of random variables $L(X^k(L(x^k)\tau))/L(x^k)$ converges in probability to the set $L(V)$.

Thus, by Theorem 3, the sequence of random variables $L(X^k(L(x^k)\tau))/L(x^k)$ converges in probability to the interval $[0, 1-\epsilon]$, where $\epsilon > 0$.

Combined with the uniform integrability just shown, this yields 

$$\limsup_{L(x) \to \infty} \frac{1}{L(x)} \mathbf{E}_x L(X(L(x)\tau)) \le 1 - \epsilon.$$

Thus, Condition (24) of Theorem 4 holds for $A$ sufficiently large. The second requirement, that the set $\{x : L(x) \le A\}$ be finite, follows from (9). Finally, the last condition, that is, $\mathbf{E}_x L(X(1)) < +\infty$ for all $x$, is easily verified, invoking once more the bounds (25) and (9). □

4. Application to phase-type service distributions. We now apply Theorem 1 to systems with general phase-type distributions rather than exponential service time distributions. More precisely, we consider the same setting as before, with user classes $r \in \mathcal{R}$ and capacity set $C \subset \mathbb{R}_+^{\mathcal{R}}$. New type $r$ users arrive as usual according to a Poisson process with intensity $\nu_r$.

The service time distribution of type $r$ customers is now defined as follows. A finite set $I_r$, referred to as the set of service phases, is given. The total service time is characterized as the aggregation of service times required in subsequent visits to phases. At each visit to phase $i$, a corresponding service time that is exponentially distributed, with parameter $\mu_{r,i}$, is required. A visit to phase $i$ is followed by a visit to phase $j$ with probability $p_{r;ij}$. A probability distribution $\{\alpha_{r,i}\}_{i \in I_r}$ on $I_r$ specifies the phase in which service starts. The transition matrix $P_r := (p_{r;ij})_{i,j \in I_r}$ is assumed to be sub-stochastic, with spectral radius strictly less than 1. It is easily checked that the above description is equivalent to the definition of phase-type distributions given in [1], page 83.
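The phase-type description above translates directly into a sampling procedure. The following is a minimal sketch, not taken from the paper: the function name `sample_phase_type`, the list-based encoding of $(\alpha, P, \mu)$, and the Erlang-2 example data are illustrative assumptions.

```python
import random

def sample_phase_type(alpha, P, mu, rng):
    """Draw one service time from the phase-type law (alpha, P, mu)."""
    phases = list(range(len(alpha)))
    # The initial phase is drawn from the distribution alpha.
    i = rng.choices(phases, weights=alpha)[0]
    total = 0.0
    while True:
        # Exponential sojourn with rate mu[i] in the current phase.
        total += rng.expovariate(mu[i])
        # Move to phase j with probability P[i][j]; with the remaining
        # probability 1 - sum(P[i]) the service is complete.
        u, acc, nxt = rng.random(), 0.0, None
        for j in phases:
            acc += P[i][j]
            if u < acc:
                nxt = j
                break
        if nxt is None:
            return total
        i = nxt

# Hypothetical example: two phases in series, each with rate 2 (Erlang-2, mean 1).
rng = random.Random(0)
alpha = [1.0, 0.0]
P = [[0.0, 1.0], [0.0, 0.0]]
mu = [2.0, 2.0]
samples = [sample_phase_type(alpha, P, mu, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
```

Sub-stochasticity of `P` with spectral radius below 1 is what guarantees that the loop terminates with probability 1.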

Denote by $\tilde{\mathcal{R}}$ the set of pairs $(r,i)$ with $r \in \mathcal{R}$ and $i \in I_r$. For all $(r,i) \in \tilde{\mathcal{R}}$, let $X_{r,i}$ denote the number of class $r$ users who are currently in phase $i$ of their service.

The process $(X_{r,i})_{(r,i) \in \tilde{\mathcal{R}}}$ is then a Markov process of the kind covered by Theorem 1. More precisely, it corresponds to the following parameters. For the class $s = (r,i) \in \tilde{\mathcal{R}}$, the external arrival rate $\nu_s$ is given by $\nu_r \alpha_{r,i}$ and the corresponding service time parameter is $\mu_s = \mu_{r,i}$. For two classes $s = (r,i)$, $s' = (r',i')$, the corresponding routing probability $p_{ss'}$ is zero if $r \neq r'$, and otherwise equals $p_{r;ii'}$. Finally, the capacity set $\tilde C$ is determined from the original capacity set $C$ as follows. The allocation vector $(\lambda_s)_{s \in \tilde{\mathcal{R}}}$ belongs to $\tilde C$ if and only if the allocation vector $(\lambda_r)_{r \in \mathcal{R}}$ belongs to $C$, where $\lambda_r$ is given by $\sum_{i \in I_r} \lambda_{(r,i)}$.

We then have the following: 

Theorem 5. The process tracking the numbers $x_r$ of users of class $r$, under proportionally fair allocation of resources characterized by the set $C$, assuming Poisson arrivals and phase type distributions as just described, is ergodic under the usual condition (8), where $\rho_r = \nu_r \sigma_r$, and $\sigma_r$ is the mean service time for class $r$ users.

Proof. By Theorem 1, ergodicity holds provided the vector $(\rho_{(r,i)})_{(r,i) \in \tilde{\mathcal{R}}}$ belongs to the interior of $\tilde C$. Equivalently, it holds if the vector with $r$th coordinate $\sum_{i \in I_r} \rho_{(r,i)}$ belongs to $\mathring{C}$, the interior of $C$.

With the specific routing probability matrix $P$ obtained from the characteristics of the phase type service distributions, one has

$$\rho_{(r,i)} = \frac{1}{\mu_{r,i}} \sum_{j \in I_r} \sum_{n \ge 0} \nu_{(r,j)}\, p_{(r,j)(r,i)}^{(n)} = \frac{\nu_r}{\mu_{r,i}} \sum_{j \in I_r} \sum_{n \ge 0} \alpha_{r,j}\, p_{r;ji}^{(n)}.$$

This in turn implies that

$$\sum_{i \in I_r} \rho_{(r,i)} = \nu_r \sum_{i \in I_r} \frac{1}{\mu_{r,i}} \sum_{j \in I_r} \sum_{n \ge 0} \alpha_{r,j}\, p_{r;ji}^{(n)}.$$

Noting that in the above expression, the last sum over $j \in I_r$ and $n \ge 0$ gives the average number of visits to phase $i$ in a class $r$ service time, it readily follows that this last expression coincides with $\nu_r \sigma_r$, which completes the proof. □
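The aggregation step in this proof can be checked numerically. The sketch below is an illustration under assumptions: the helper names `expected_visits` and `phase_loads` are hypothetical, and the sum over $n \ge 0$ is truncated once the remaining mass is negligible. It computes the expected number of visits to each phase, then the per-phase loads, and recovers the mean service time from their sum.

```python
def expected_visits(alpha, P, tol=1e-12, max_iter=100_000):
    """Return v with v[i] = sum_j sum_{n>=0} alpha[j] * (P^n)[j][i]."""
    n = len(alpha)
    current = list(alpha)      # the row vector alpha P^n, starting from n = 0
    visits = [0.0] * n
    for _ in range(max_iter):
        visits = [v + c for v, c in zip(visits, current)]
        current = [sum(current[j] * P[j][i] for j in range(n)) for i in range(n)]
        if sum(current) < tol:  # remaining mass is negligible
            break
    return visits

def phase_loads(nu_r, alpha, P, mu):
    """Per-phase loads: nu_r * (expected visits to phase i) / mu_i."""
    visits = expected_visits(alpha, P)
    return [nu_r * v / m for v, m in zip(visits, mu)]

# Hypothetical class: arrival rate 0.3, Erlang-2 service with mean 1.
nu_r = 0.3
loads = phase_loads(nu_r, [1.0, 0.0], [[0.0, 1.0], [0.0, 0.0]], [2.0, 2.0])
sigma_r = sum(loads) / nu_r    # mean service time recovered from the loads
```

With these inputs each phase carries half of the class load, and `sigma_r` equals the Erlang-2 mean.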

5. Relationships between balanced fairness and proportional fairness. In this section we define the balanced fairness NBA, give an equivalent characterization and then use it to relate the stationary distributions under balanced fairness and modified proportional fairness.

The balanced fairness NBA, introduced in [5], is best defined in terms of the balance function. The balance function, denoted $\psi$, is defined by induction on $\mathbb{Z}_+^{\mathcal{R}}$, starting from $\psi(0) = 1$, $\psi(x) = 0$ for any $x$ not in $\mathbb{R}_+^{\mathcal{R}}$, and

$$\psi(x) = \inf\{a > 0 : (a^{-1}\psi(x - e_r))_{r \in \mathcal{R}} \in C\},$$

where $e_r$ is the $r$th unit vector in $\mathbb{R}^{\mathcal{R}}$. The balanced fairness rate allocation vector $\lambda^{BF}$ is then defined as

$$\lambda_r^{BF}(x) = \frac{\psi(x - e_r)}{\psi(x)}.$$

As for proportional fairness, it is convenient to consider the logarithms $\gamma_r$ of the allocated capacities $\lambda_r$, rather than the $\lambda_r$ themselves. Denote by $\gamma^{BF}(x)$ the vector of logarithms of balanced fair allocations, which must lie in the convex nonincreasing set $K$. Introduce the notation $\phi(x) = -\log\psi(x)$. Thus one has

$$\gamma_r^{BF}(x) = \phi(x) - \phi(x - e_r).$$

In other words, the vector $\gamma^{BF}(x)$ is given by the increments of the function $\phi$ at $x$, and can be seen as an approximate gradient of $\phi$ at $x$. We introduce the notation $\nabla_d f(x) = (f(x) - f(x - e_r))_{r \in \mathcal{R}}$. Note that a stationary measure for the Markov process counting users of all types is given in terms of the function $\phi$ by

(26) $\pi^{BF}(x) = \frac{1}{Z} \exp\Big(-\phi(x) + \sum_{r \in \mathcal{R}} x_r \log(\rho_r)\Big)$

for some normalization constant $Z$. A consequence of the reversibility property of the Markov process is that this measure is also stationary for the modified Markov process with Markovian routing [5].
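The inductive definition of the balance function is straightforward to evaluate on small states. The following is a minimal sketch under an added assumption: the capacity set is taken to be polyhedral, $C = \{\lambda \ge 0 : A\lambda \le c\}$, in which case the infimum in the definition reduces to a maximum over the linear constraints. The names `make_balance_function`, `A` and `c` are hypothetical.

```python
from functools import lru_cache

def make_balance_function(A, c):
    """Balance function for the polyhedral set {lam >= 0 : A lam <= c}."""
    n_types = len(A[0])

    def minus_unit(x, r):
        return x[:r] + (x[r] - 1,) + x[r + 1:]

    @lru_cache(maxsize=None)
    def psi(x):
        if min(x) < 0:
            return 0.0               # psi vanishes outside the nonnegative orthant
        if sum(x) == 0:
            return 1.0               # psi(0) = 1
        prev = [psi(minus_unit(x, r)) for r in range(n_types)]
        # inf{a > 0 : (psi(x - e_r) / a)_r in C} equals the maximum over the
        # constraints l of sum_r A[l][r] * psi(x - e_r) / c[l].
        return max(sum(a * p for a, p in zip(row, prev)) / cl
                   for row, cl in zip(A, c))

    def allocation(x):
        """Balanced fair rates lambda_r(x) = psi(x - e_r) / psi(x)."""
        return [psi(minus_unit(x, r)) / psi(x) for r in range(n_types)]

    return psi, allocation

# Hypothetical example: two user types sharing one link of capacity 1.
psi, allocation = make_balance_function(A=[[1.0, 1.0]], c=[1.0])
rates = allocation((2, 1))
```

States are passed as tuples so that they can be cached. For a single unit-capacity link the recursion gives binomial coefficients, and the resulting rates are proportional to the numbers of users of each type.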
We now give an alternative definition of $\phi$.



Lemma 8. The function $\phi$ admits the following characterization:

$$\phi(x) = \sup_{f \in \mathcal{F}} \{f(x)\},$$

where $\mathcal{F}$ is the set of functions defined on $\mathbb{Z}^{\mathcal{R}}$ such that $f(0) = 0$, $f(y) = +\infty$ for $y \notin \mathbb{Z}_+^{\mathcal{R}}$, and $\nabla_d f(y)$ belongs to $K$ for all $y \in \mathbb{Z}_+^{\mathcal{R}}$.

Proof. Denote by $\tilde\phi(x)$ the result of the optimization problem in the right-hand side of the above expression. Proceed by induction on $x \in \mathbb{Z}_+^{\mathcal{R}}$ to show that $\tilde\phi(x) = \phi(x)$. Clearly, $\tilde\phi(0) = \phi(0) = 0$. Also, as the function $\phi$ satisfies the conditions over which the optimization is performed, necessarily one has that $\phi(x) \le \tilde\phi(x)$, for all $x \in \mathbb{Z}_+^{\mathcal{R}}$. Assume thus that $\tilde\phi(y) = \phi(y)$ for all $y \le x$, $y \neq x$. The definition by induction of $\psi$ implies that

$$\phi(x) = \sup\{a : (a - \phi(x - e_r))_{r \in \mathcal{R}} \in K\}.$$

On the other hand, for any $f$ satisfying the assumptions,

$$f(x) \le \sup\{a : (a - f(x - e_r))_{r \in \mathcal{R}} \in K\} \le \sup\{a : (a - \phi(x - e_r))_{r \in \mathcal{R}} \in K\} = \phi(x).$$

We have used for the first inequality the definition of the constraints satisfied by $f$, for the second we have used the induction hypothesis that $f(x - e_r) \le \phi(x - e_r)$ together with monotonicity of the set $K$, and the last equality is just the inductive definition of $\phi$. □



We are now ready to establish the following:

Theorem 6. For any $x \in \mathbb{Z}_+^{\mathcal{R}}$, the following inequalities hold:

(27) $\delta_K^*(x) \le \phi(x) \le \delta_K^*(x) + r(x)$,

where

$$r(x) := \sum_{r \in \mathcal{R}} \sum_{m=1}^{x_r} \frac{1}{m}.$$

Proof. The two inequalities shall be established by induction on $\sum_{r \in \mathcal{R}} x_r$ for $x \in \mathbb{Z}_+^{\mathcal{R}}$. They obviously hold true for $x = 0$, as $\phi(0) = \delta_K^*(0) = 0$. Assume thus that they hold for all $y \in \mathbb{Z}_+^{\mathcal{R}}$ such that $\sum_{r \in \mathcal{R}} y_r \le n$, for some integer $n \ge 0$, and let $x \in \mathbb{Z}_+^{\mathcal{R}}$ be given, with $\sum_{r \in \mathcal{R}} x_r = n + 1$. By the induction hypothesis and the result of Lemma 8, it holds that

$$\phi(x) = \sup\{a : (a - \phi(x - e_r))_{r \in \mathcal{R}} \in K\} \ge \sup\{a : (a - \delta_K^*(x - e_r))_{r \in \mathcal{R}} \in K\}.$$

Now, in view of Lemma 4, it holds that

$$(\delta_K^*(x) - \delta_K^*(x - e_r))_{r \in \mathcal{R}} \in K.$$

Therefore,

$$\phi(x) \ge \delta_K^*(x),$$

and the first inequality in (27) is established.

By the induction hypothesis again, we have that

(28) $\phi(x) \le \sup\{a : (a - \delta_K^*(x - e_s) - r(x - e_s))_{s \in \mathcal{R}} \in K\}.$

Consider first the case where $x_s > 0$ for all $s \in \mathcal{R}$. We shall rely on the following lemma, the proof of which will be given after the end of the current proof.



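Lemma 9. For all $x, h \in \mathbb{R}^{\mathcal{R}}$, such that $x$ has strictly positive coordinates, and $x + h$ has nonnegative coordinates, it holds that

(29) $\delta_K^*(x + h) \le \delta_K^*(x) + \langle h, \gamma^{PF}(x) \rangle + \sum_{s \in \mathcal{R}} \frac{h_s^2}{x_s}.$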

Thus, in view of the previous equation, applied with $h = -e_s$, we have that

$$\delta_K^*(x - e_s) \le \delta_K^*(x) - \gamma_s^{PF}(x) + \frac{1}{x_s}.$$

Combining this upper bound with (28), as the vector $(\gamma_s^{PF}(x))_{s \in \mathcal{R}}$ is in $K$, we have that

$$\phi(x) \le \delta_K^*(x) + \sup_{s \in \mathcal{R}} \Big\{ r(x - e_s) + \frac{1}{x_s} \Big\}.$$

In view of the definition of $r(y)$, the second term in the right-hand side is clearly upper bounded by $r(x)$, which establishes the desired inequality for $x$.

To conclude the proof, it remains to deal with the case where some coordinates $x_s$ equal zero. This case is in fact similar to the previous one: if $x$ belongs to face $I$ (i.e., $x_s = 0$ if and only if $s \notin I$), the previous argument carries over in $\mathbb{Z}_+^{I}$, by considering the convex set $K_I$ instead of $K$. □



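Proof of Lemma 9. Let $x, h \in \mathbb{R}^{\mathcal{R}}$ be fixed, such that $x$ has strictly positive coordinates, and $x + h$ has nonnegative coordinates. Let $\gamma \in \mathbb{R}^{\mathcal{R}}$ be such that

$$\delta_K^*(x) = \langle x, \gamma \rangle - \delta_K(\gamma).$$

The pair $(x, \gamma)$ verifies the relations $x \in \partial\delta_K(\gamma)$, $\gamma \in \partial\delta_K^*(x)$. In addition, the following one-to-one correspondence between subgradients of $\delta_C$ and $\delta_K$ can be established:

$$x \in \partial\delta_K(\gamma) \;\Leftrightarrow\; (x_s e^{-\gamma_s})_{s \in \mathcal{R}} \in \partial\delta_C((e^{\gamma_s})_{s \in \mathcal{R}}).$$

We have that

$$\delta_K^*(x + h) = \sup_{g} \{\langle g, x + h \rangle - \delta_K(g)\} = \sup_{u} \{\langle u + \gamma, x + h \rangle - \delta_K(u + \gamma)\} = \delta_K^*(x) + \langle h, \gamma \rangle + \sup_{u} \{\langle u, x + h \rangle + \delta_K(\gamma) - \delta_K(\gamma + u)\}.$$

However, by convexity of $\delta_C$, and recalling that $e^{-\gamma}x \in \partial\delta_C(e^{\gamma})$, we have the following inequality:

$$\delta_K(\gamma + u) = \delta_C(e^{\gamma + u}) \ge \delta_C(e^{\gamma}) + \langle e^{\gamma + u} - e^{\gamma}, e^{-\gamma}x \rangle.$$

Combined with the previous expression for $\delta_K^*(x + h)$, this yields

$$\delta_K^*(x + h) \le \delta_K^*(x) + \langle h, \gamma \rangle + \sup_{u} \Big\{ \sum_{s \in \mathcal{R}} u_s(x_s + h_s) - (e^{u_s} - 1)x_s \Big\}$$

$$= \delta_K^*(x) + \langle h, \gamma \rangle + \sum_{s \in \mathcal{R}:\, x_s + h_s > 0} \big[(x_s + h_s)\log(1 + h_s/x_s) - h_s\big] + \sum_{s \in \mathcal{R}:\, x_s + h_s = 0} x_s.$$

The claimed inequality (29) now follows by noting that (i) $\log(1 + h_s/x_s) \le h_s/x_s$, and (ii) $\gamma = \gamma^{PF}(x)$. □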
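The last step of this proof rests on a scalar inequality: for $x > 0$ and $x + h \ge 0$, the supremum over $u$ of $u(x+h) - (e^u - 1)x$ is at most $h^2/x$. The sketch below checks this on a small grid; the function names and grid values are illustrative choices, not part of the paper.

```python
import math

def exact_term(x, h):
    """sup over u of u*(x + h) - (exp(u) - 1)*x, for x > 0 and x + h >= 0."""
    if x + h == 0:
        return x                      # approached as u tends to minus infinity
    # The maximizer is u = log(1 + h/x).
    return (x + h) * math.log(1 + h / x) - h

def quadratic_bound(x, h):
    """The per-coordinate remainder h^2 / x appearing in (29)."""
    return h * h / x

grid = [(x, h) for x in (0.5, 1.0, 3.0, 10.0)
        for h in (-0.5, -0.25, 0.0, 0.5, 2.0, 7.0) if x + h >= 0]
gaps = [quadratic_bound(x, h) - exact_term(x, h) for x, h in grid]
```

Every entry of `gaps` is nonnegative up to rounding, with equality in the boundary case $x + h = 0$ and at $h = 0$.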

A simple consequence of the theorem is the following. 

Corollary 1. For any $x \in \mathbb{R}_+^{\mathcal{R}}$, it holds that

$$\lim_{n \to \infty} \frac{1}{n}\phi(nx) = \delta_K^*(x).$$

Proof. This follows trivially since the function $\delta_K^*$ is positively homogeneous, that is, $\delta_K^*(nx) = n\delta_K^*(x)$, and since the remainder term $r(nx)$ in (27) is of order $\log(n)$, and a fortiori $o(n)$. □
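Both the sandwich (27) and the limit in Corollary 1 can be observed on the simplest example. The sketch below rests on assumptions that are not stated in this section: a single link of unit capacity, for which the balance recursion gives $\psi(x) = |x|!/\prod_r x_r!$, and the reading of $\delta_K^*$ as the support function $\sup_{\gamma \in K} \langle x, \gamma \rangle$ of $K = \{\gamma : \sum_r e^{\gamma_r} \le 1\}$. All function names are hypothetical.

```python
import math

def phi_single_link(x):
    """phi(x) = -log psi(x), with psi(x) = |x|! / prod_r x_r! on a unit link."""
    return -(math.lgamma(sum(x) + 1) - sum(math.lgamma(v + 1) for v in x))

def support_single_link(x):
    """sup of <x, gamma> over {gamma : sum_r exp(gamma_r) <= 1}."""
    total = sum(x)
    return sum(v * math.log(v / total) for v in x if v > 0)

def remainder(x):
    """r(x) = sum_r sum_{m=1}^{x_r} 1/m, as in Theorem 6."""
    return sum(1.0 / m for v in x for m in range(1, v + 1))

x = (2, 1)
# phi(n x) / n for increasing n, to be compared with the support function at x.
ratios = {n: phi_single_link(tuple(n * v for v in x)) / n
          for n in (1, 10, 1000, 100000)}
limit = support_single_link(x)
```

Under these assumptions the gap between `ratios[n]` and `limit` shrinks like $\log(n)/n$, in line with the remainder term of (27).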

Note that, in view of (26),

$$-\frac{1}{n}\log(\pi^{BF}(nx)) = \frac{1}{n}\phi(nx) - \sum_{r \in \mathcal{R}} x_r \log(\rho_r) + \frac{1}{n}\log Z.$$

The last term must go to zero as $n \to \infty$. This together with Corollary 1 yields the following:

Corollary 2. The stationary distribution $\pi^{BF}$ as in (26) admits the following large deviations asymptotics:

(30) $\lim_{n \to \infty} \frac{1}{n}\log\pi^{BF}(nx) = -L(x)$, $x \in \mathbb{R}_+^{\mathcal{R}}$,

where $L$ is the Lyapunov function (6) used in the study of stability properties of proportional fairness. It thus admits the same large deviations characteristics as the stationary distribution (7) of the system under PF sharing.

Remark 2. The result of Theorem 6 also implies that, if for all $x \in \mathbb{R}_+^{\mathcal{R}}$, there exists a limit $\lim_{n \to \infty} \lambda^{BF}(nx)$ of the allocation vector under balanced fairness, then it must coincide with $\lambda^{PF}(x)$. So far we have not been able to establish the existence of such a limit, except in the special case where $|\mathcal{R}| = 2$, although it seems plausible that the limit exists more generally.

APPENDIX A: PROOF OF THEOREM 2 

We argue by contradiction, assuming that for some $\epsilon > 0$, and for all $k$ in an infinite subsequence of the original sequence, it holds that

(31) $\mathbf{P}\Big( \inf_{f} \sup_{t \in [0,T]} |z_k^{-1} X^k(z_k t) - f(t)| \ge \epsilon \Big) \ge \epsilon.$

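In the rest of the proof, without loss of generality we assume that the above evaluation holds true for all $k \ge 1$.

The trajectories of the processes $X^k(t)$ can be represented explicitly in terms of independent unit rate Poisson processes $A_r^k$, $D_{rs}^k$, $D_r^k$, $r, s \in \mathcal{R}$, as follows:

$$X_r^k(t) = X_r^k(0) + A_r^k(\nu_r t) + \sum_{s \in \mathcal{R}} D_{sr}^k\Big(\mu_s p_{sr} \int_0^t \lambda_s^{PF}(X^k(u))\,du\Big) - \sum_{s \in \mathcal{R}} D_{rs}^k\Big(\mu_r p_{rs} \int_0^t \lambda_r^{PF}(X^k(u))\,du\Big) - D_r^k\Big(\mu_r \Big(1 - \sum_{s \in \mathcal{R}} p_{rs}\Big) \int_0^t \lambda_r^{PF}(X^k(u))\,du\Big).$$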

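This implies the following, by a change of variables in the integrals, and using the fact that $\lambda^{PF}(ax) = \lambda^{PF}(x)$ for all scalar $a > 0$:

$$\frac{1}{z_k} X_r^k(z_k t) = \frac{1}{z_k} X_r^k(0) + \nu_r t + \sum_{s \in \mathcal{R}} \mu_s p_{sr} \int_0^t \lambda_s^{PF}(z_k^{-1} X^k(z_k u))\,du - \mu_r \int_0^t \lambda_r^{PF}(z_k^{-1} X^k(z_k u))\,du + \epsilon_r^k(t),$$

where the error term $\epsilon_r^k(t)$ verifies, for all $T > 0$,

$$\sup_{t \in [0,T]} |\epsilon_r^k(t)| \le \frac{1}{z_k} \sup_{t \in [0, \nu_r T]} |A_r^k(z_k t) - z_k t| + \sum_{s \in \mathcal{R}} \frac{1}{z_k} \sup_{t \in [0, \mu_s p_{sr} A T]} |D_{sr}^k(z_k t) - z_k t| + \sum_{s \in \mathcal{R}} \frac{1}{z_k} \sup_{t \in [0, \mu_r p_{rs} A T]} |D_{rs}^k(z_k t) - z_k t| + \frac{1}{z_k} \sup_{t \in [0, \mu_r A T]} |D_r^k(z_k t) - z_k t|.$$

In these expressions, $A$ is a constant such that $C \subset [0, A]^{\mathcal{R}}$.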

The following large deviations bound on the maximal deviation of a unit 
rate Poisson process from its mean is now needed: 

Lemma A.1. Let $\Xi$ be a unit rate Poisson process. Then for all $T > 0$, and all $\lambda > 0$, it holds that

(32) $\mathbf{P}\Big( \sup_{0 \le t \le T} |\Xi(t) - t| \ge \lambda T \Big) \le e^{-T h(\lambda)} + e^{-T h(-\lambda)},$

where

(33) $h(\lambda) := (1 + \lambda)\log(1 + \lambda) - \lambda$

is the Cramér transform of a unit mean, centered Poisson random variable. In the above formula, it is understood that $h(-\lambda) = +\infty$ if $\lambda > 1$.
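To get a feel for how quickly the bound (32) decays in $T$, the right-hand side can be evaluated directly. This is a small sketch with hypothetical function names; the value assigned at $\lambda = -1$ is the limit of (33), an added convention.

```python
import math

def cramer_h(lam):
    """h(lam) = (1 + lam) log(1 + lam) - lam, with h = +inf below -1."""
    if lam < -1:
        return math.inf
    if lam == -1:
        return 1.0          # limit of (1 + lam) log(1 + lam) - lam as lam -> -1
    return (1 + lam) * math.log(1 + lam) - lam

def deviation_bound(T, lam):
    """Right-hand side of (32) for T > 0: exp(-T h(lam)) + exp(-T h(-lam))."""
    return math.exp(-T * cramer_h(lam)) + math.exp(-T * cramer_h(-lam))

# Bound on the probability that a unit rate Poisson process deviates from its
# mean by at least 50 (lam = 0.5) somewhere on [0, 100].
bound = deviation_bound(100.0, 0.5)
```

The exponential decay in $T$ for fixed $\lambda$ is what makes the error terms summable along a suitable subsequence.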

This result and its proof are standard (see, e.g., [26]). It implies the existence of a subsequence $k(\ell)$, $\ell \ge 1$, of the original sequence, and a sequence $\epsilon(\ell)$ decreasing to zero, such that

$$\sum_{\ell \ge 1} \mathbf{P}\Big( \sup_{t \in [0,T]} |\epsilon_r^{k(\ell)}(t)| \ge \epsilon(\ell) \Big) < \infty, \quad r \in \mathcal{R}.$$

Indeed, it can be checked from the definition of $\epsilon_r(t)$ and Lemma A.1 that the sum in the left-hand side is finite for the particular choice

$k(1) = 1$,

$k(\ell) = \min\{k > k(\ell - 1) : z_k \ge \ell\}$, $\ell > 1$,

$\epsilon(\ell) = \ell^{-1/4}$, $\ell \ge 1$.

Without loss of generality we again assume that finiteness of the sum holds true for the original sequence $k \ge 1$. Thus, by the Borel-Cantelli lemma, almost surely, $\sup_{t \in [0,T]} |\epsilon_r^k(t)| \to 0$ as $k \to \infty$. The following variation on the Arzelà-Ascoli theorem will then be used to proceed:

Lemma A.2 (Lemma 6.3, [30]). Suppose that a sequence of functions $f_k : [0,T] \to \mathbb{R}$ has the following properties:

(i) $\{f_k(0)\}_{k > 0}$ is bounded;

(ii) there is a constant $M > 0$, and a sequence of positive numbers $a_k$, with $a_k \to 0$ as $k \to \infty$, such that

$$|f_k(t) - f_k(s)| \le M|t - s| + a_k, \quad k > 0,\; s, t \in [0,T].$$

Then the sequence admits a subsequence that converges uniformly on $[0,T]$ to a Lipschitz continuous function $f : [0,T] \to \mathbb{R}$ with Lipschitz constant $M$.

In the present setup, this result guarantees that for any $T > 0$, with probability 1, for any subsequence of the original sequence $k \ge 1$, there exists a further subsequence, denoted $k'$, along which, for all $r \in \mathcal{R}$, the following convergences take place, uniformly on $[0,T]$:

$$\int_0^t \lambda_r^{PF}(z_{k'}^{-1} X^{k'}(z_{k'} u))\,du \to D_r(t),$$

$$z_{k'}^{-1} X_r^{k'}(z_{k'} t) \to x_r(t) := x_r(0) + \nu_r t + \sum_{s \in \mathcal{R}} p_{sr}\mu_s D_s(t) - \mu_r D_r(t),$$

where the functions $D_r$ are $A$-Lipschitz. (A set in which any infinite sequence admits a convergent subsequence is usually called sequentially compact. Sequential compactness is equivalent to compactness in the case of metric spaces.) We shall now establish that all such limits are fluid trajectories of the system. To this end, the following lemma, also taken from [30], Lemma 6.2(b), is needed:

Lemma A.3. For all $r \in \mathcal{R}$, and any $x \in \mathbb{R}_+^{\mathcal{R}}$ such that $x_r > 0$, the bandwidth allocation function $\lambda_r^{PF}$ is continuous at $x$.

In fact, Ye, Ou and Yuan establish this result in the context of particular, polyhedral capacity sets $C$; however their proof applies more generally to the current context of convex, nonincreasing sets $C$. We do not reproduce it here.
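The limiting equation for $x_r(t)$, with $D_r$ having derivative $\lambda_r^{PF}(x(t))$ wherever $x_r(t) > 0$, can be integrated numerically in simple cases. The sketch below is only an illustration under added assumptions: a single link of capacity 1, for which the proportionally fair rates are $\lambda_r = x_r / \sum_s x_s$, an explicit Euler scheme, and a crude projection at zero in place of the differential inclusion that governs coordinates at zero. All names and numerical values are hypothetical.

```python
def pf_single_link(x, capacity=1.0):
    """Proportionally fair rates on one link: capacity * x_r / sum(x)."""
    total = sum(x)
    if total <= 0:
        return [0.0] * len(x)
    return [capacity * v / total for v in x]

def fluid_step(x, nu, mu, P, dt):
    """One Euler step of x_r' = nu_r + sum_s mu_s p_sr lambda_s - mu_r lambda_r."""
    lam = pf_single_link(x)
    n = len(x)
    new_x = []
    for r in range(n):
        inflow = sum(mu[s] * P[s][r] * lam[s] for s in range(n))
        drift = nu[r] + inflow - mu[r] * lam[r]
        new_x.append(max(0.0, x[r] + dt * drift))   # crude projection at zero
    return new_x

# Hypothetical data: two symmetric types, no routing, total load 0.6 < 1.
nu, mu = [0.3, 0.3], [1.0, 1.0]
P = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 1.0]
for _ in range(200):            # integrate up to time 2 with dt = 0.01
    x = fluid_step(x, nu, mu, P, 0.01)
```

In this symmetric example each coordinate decreases at rate 0.2, so the trajectory drains linearly towards zero, as expected when the load lies inside the capacity set.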

Let then $t$ be a point at which all functions $D_r$ are differentiable. By Rademacher's theorem, this holds for almost every $t \in [0,T]$. Consider first the case where $x_r(t) > 0$. One then has, for all sufficiently small $h > 0$:

$$\int_t^{t+h} \lambda_r^{PF}(z_{k'}^{-1} X^{k'}(z_{k'} u))\,du \to \int_t^{t+h} \lambda_r^{PF}(x(u))\,du,$$

in view of (i) Lipschitz continuity of $u \mapsto x(u)$, which entails positivity of $x_r(u)$ on $[t, t+h]$, (ii) the continuity property of $\lambda_r^{PF}$ given in Lemma A.3, and (iii) an application of Lebesgue's dominated convergence theorem. Therefore, appealing once more to Lemma A.3, the derivative of the function $D_r$ at $t$ must coincide with $\lambda_r^{PF}(x(t))$.

Consider now the case where $x_r(t) = 0$. Clearly, by Fatou's lemma, for all $h > 0$, one has

$$\limsup_{k' \to \infty} \int_t^{t+h} \lambda_r^{PF}(z_{k'}^{-1} X^{k'}(z_{k'} u))\,du \le \int_t^{t+h} \limsup_{y \to x(u)} \lambda_r^{PF}(y)\,du.$$

On the other hand, the function $x \mapsto \limsup_{y \to x} \lambda_r^{PF}(y)$ is upper semi-continuous, and thus it follows that

$$\limsup_{u \to t} \Big[ \limsup_{y \to x(u)} \lambda_r^{PF}(y) \Big] \le \limsup_{y \to x(t)} \lambda_r^{PF}(y).$$

This readily implies that, necessarily, the derivative of $u \mapsto D_r(u)$ at $t$ must lie in the interval $[0, \limsup_{y \to x(t)} \lambda_r^{PF}(y)]$.

We have thus shown that for any interval $[0,T]$, with probability 1, from any subsequence one can extract a further subsequence $k'$ along which the rescaled process $z_{k'}^{-1} X^{k'}(z_{k'}\,\cdot)$ converges to a fluid trajectory, uniformly on $[0,T]$. That is to say, almost surely, the accumulation points of the rescaled trajectories consist only of fluid trajectories. This is in contradiction with (31), and the result of Theorem 2 then follows.

APPENDIX B: PROOF OF LEMMA 5

Proof of part (i). Let $u \mapsto x(u)$ denote a fluid trajectory. Let $t > 0$ be a point at which all the associated functions $D_r$ are differentiable. For notational convenience, write $x$ for $x(t)$, $I$ for $I(t)$, $\bar I$ for $\mathcal{R} \setminus I$, and $\lambda_r$ for $\lambda_r^{PF}(x(t))$. For $r \in I$, Theorem 2 establishes that $\dot D_r(t) = \lambda_r$. Let $d_r$ denote the derivative $\dot D_r(t)$ for $r \in \bar I$. As the trajectories $u \mapsto x_r(u)$ are constrained to be nonnegative, necessarily one has for all $r \in \bar I$:

$$\dot x_r(t) = 0 = \nu_r + \sum_{s \in I} \mu_s p_{sr} \lambda_s + \sum_{s \in \bar I} \mu_s p_{sr} d_s - \mu_r d_r.$$

This can be written in matrix form as

$$(\mu d)_{\bar I} = \nu_{\bar I} + (P^T)_{\bar I I}(\mu\lambda)_I + (P^T)_{\bar I \bar I}(\mu d)_{\bar I},$$

where $(\mu d)_J$ denotes the vector with entries $\mu_j d_j$, $j \in J$, and $(M)_{IJ}$ denotes the matrix with entries $M_{ij}$, $i \in I$, $j \in J$. The matrix $(P^T)_{\bar I \bar I}$ has a spectral radius strictly less than 1, for otherwise the original routing matrix $P$ would have a spectral radius of at least 1. Thus there exists a unique solution $(\mu d)_{\bar I}$ to the above equation, given by

$$(\mu d)_{\bar I} = (I - (P^T)_{\bar I \bar I})^{-1}\big[\nu_{\bar I} + (P^T)_{\bar I I}(\mu\lambda)_I\big].$$

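In view of this expression, for $r \in I$, the time derivatives $\dot x_r(t)$ can be written as

$$(\dot x)_I = (\nu)_I + (P^T)_{II}(\mu\lambda)_I + (P^T)_{I \bar I}(\mu d)_{\bar I} - (\mu\lambda)_I$$

$$= (\nu)_I + (P^T)_{I \bar I}(I - (P^T)_{\bar I \bar I})^{-1}\nu_{\bar I} + \big[(P^T)_{II} + (P^T)_{I \bar I}(I - (P^T)_{\bar I \bar I})^{-1}(P^T)_{\bar I I}\big](\mu\lambda)_I - (\mu\lambda)_I$$

$$= \tilde\nu + \tilde P^T(\mu\lambda)_I - (\mu\lambda)_I,$$

where we have introduced the notation

$$\tilde\nu = (\nu)_I + (P^T)_{I \bar I}(I - (P^T)_{\bar I \bar I})^{-1}\nu_{\bar I},$$

$$\tilde P^T = (P^T)_{II} + (P^T)_{I \bar I}(I - (P^T)_{\bar I \bar I})^{-1}(P^T)_{\bar I I}.$$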

The modified routing probability $\tilde p_{rs}$ can be interpreted as the probability that, in a Markov chain on $\mathcal{R}$ started at $r \in I$, evolving according to the original routing probabilities $p_{ij}$ (which may become absorbed outside the set $\mathcal{R}$), the next visit to the set $I$ is precisely to state $s$. That is to say, $\tilde p$ captures the transition probability in the original chain, after removing all excursions to the set $\bar I$. This interpretation allows one to deduce at once that the modified routing probability matrix $\tilde P$ is sub-stochastic and with spectral radius strictly less than 1 whenever $P$ is so.

It remains to establish the identity

$$\sum_{s \in I} \tilde\nu_s \sum_{n \ge 0} \tilde p_{sr}^{(n)} = \sum_{s \in \mathcal{R}} \nu_s \sum_{n \ge 0} p_{sr}^{(n)}, \quad r \in I.$$

Again, this can be established from a probabilistic interpretation. Assume without loss of generality (by joint rescaling) that the entries of the vector $\nu$ sum to 1. Then the right-hand side can be interpreted as the average number of visits to state $r$ in the Markov chain with transition probabilities $p$, assuming that the initial distribution is specified by $\nu$. It is readily verified that $\tilde\nu$ then represents the distribution of the first visit to $I$, which is also the initial distribution of the chain where excursions to $\bar I$ are removed. The mean number of visits to states $r \in I$ is the same with or without removal of excursions into $\bar I$, hence the desired identity holds.



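Proof of part (ii). Let us establish (17). In view of Lemma 1, one has

(34) $L(x(t+h)) - L(x(t)) = \sum_{r=1}^{|\mathcal{R}|} \int_{x_r(t)}^{x_r(t+h)} \big[\gamma_r^{PF}(y^r(u)) - \log(\rho_r)\big]\,du,$

where the vector $y^r(u)$ is defined as

$$y_s^r(u) = \begin{cases} x_s(t+h), & s < r, \\ u, & s = r, \\ x_s(t), & s > r. \end{cases}$$

At a point $t$ where the fluid trajectories are differentiable, one thus has, by the continuity of the functions $\gamma_r^{PF}$ at $x(t)$ for $r \in I$, which follows from Lemma A.3,

(35) $\lim_{h \to 0} \frac{1}{h} \int_{x_r(t)}^{x_r(t+h)} \big[\gamma_r^{PF}(y^r(u)) - \log(\rho_r)\big]\,du = \dot x_r(t)\big[\gamma_r^{PF}(x(t)) - \log(\rho_r)\big].$

For $r \in \bar I = \mathcal{R} \setminus I$ and $h > 0$, one has the evaluation

$$\int_{x_r(t)}^{x_r(t+h)} \gamma_r^{PF}(y^r(u))\,du \le \sup_{y \in \mathbb{R}_+^{\mathcal{R}}} (\gamma_r^{PF}(y))\,(x_r(t+h) - x_r(t)).$$

Indeed, this holds because $x_r(t+h) - x_r(t) \ge 0$, which holds in turn because $x_r(t) = 0$ for $r \in \bar I$, and $x_r(t+h) \ge 0$. This inequality entails that

$$\limsup_{h \downarrow 0} \frac{1}{h} \int_{x_r(t)}^{x_r(t+h)} \gamma_r^{PF}(y^r(u))\,du \le \sup_{y \in \mathbb{R}_+^{\mathcal{R}}} (\gamma_r^{PF}(y))\,\dot x_r(t) = 0,$$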

where boundedness from above of $\gamma^{PF}$ has been used. This inequality, together with (35) and (34), establishes (17).

Proof of part (iii). We finally prove (18). To this end, we establish upper bounds on each of the terms in the right-hand side of (34). Note that, for all $t > 0$, and $r \in I(x(t))$, the functions $D_r$ are differentiable at $t$, with derivative $\lambda_r^{PF}(x(t))$. Also, for all $s \in \mathcal{R}$, the functions $D_s$ are nondecreasing and Lipschitz with some constant $A$. It thus follows that, for $r \in I$:

1 fXr{t+h) 

limsup- / [7,. {y' (u)) -log{pr)]du 

h\0 n, Jxr(t) 

< sup (j^^iy) - log(y9,))(l7,, + \n\A) 

The first two terms in the right-hand side are bounded from above, and the last term is uniformly bounded, since $\lambda_r^{PF}(x(t)) = \exp(\gamma_r^{PF}(x(t)))$ and the function $u \mapsto u e^u$ is bounded on a range $(-\infty, A]$.

It remains to consider the case where $r \in \bar I$. One then has

$$\int_{x_r(t)}^{x_r(t+h)} \big[\gamma_r^{PF}(y^r(u)) - \log(\rho_r)\big]\,du \le \sup_{y} \big(\gamma_r^{PF}(y) - \log(\rho_r)\big)\,[x_r(t+h) - x_r(t)] \le \sup_{y} \big(\gamma_r^{PF}(y) - \log(\rho_r)\big)^+ hA,$$

where $A$ is a Lipschitz constant for $u \mapsto x_r(u)$, and nonnegativity of the fluid trajectories has been used. These last two upper bounds together combine to give (18).

APPENDIX C: PROOF OF LEMMA 6 

It follows from Proposition 3, page 21 in [8] (see also [18]) that a continuous function $f$ verifying assumption (20) is such that

(36) $s < t \;\Rightarrow\; f(t) - f(s) \le (t - s)A.$

Define now the increasing variation $V_f^+(t)$ as the supremum over partitions $\tau_0 = 0 < \tau_1 < \cdots < \tau_m = t$ of the sum

$$\sum_{i=0}^{m-1} (f(\tau_{i+1}) - f(\tau_i))^+.$$



In view of (36), it follows that

$$V_f^+(t) \le At, \quad t \in [0,T].$$

It is easily shown that for all $s < t \in [0,T]$, one has

$$V_f^+(t) - V_f^+(s) = \sup_{\tau_0 = s < \cdots < \tau_n = t} \sum_{i=0}^{n-1} (f(\tau_{i+1}) - f(\tau_i))^+,$$

where the supremum is taken over all finite partitions $\tau_0 = s < \cdots < \tau_n = t$. This together with (36) implies that

$$0 \le V_f^+(t) - V_f^+(s) \le A(t - s), \quad s < t \in [0,T].$$

Moreover, if one defines $V_f^-(t)$ as

$$V_f^-(t) := V_f^+(t) - f(t),$$

one readily sees that $u \mapsto V_f^-(u)$ is a nondecreasing function. One may associate a nonnegative measure $\mu^-$ on $[0,T]$ to $V_f^-$ by setting

$$\mu^-([0,t]) = V_f^-(t+) - V_f^-(0).$$

By the Radon-Nikodym theorem, this measure can further be decomposed into a measure that is absolutely continuous with respect to Lebesgue measure, whose density we shall denote by $g^-(t)$, and into a measure $\nu^-$ that is supported by a set $F$ of null Lebesgue measure.

By Rademacher's theorem, the Lipschitz-continuous function $V_f^+$ is almost everywhere differentiable; denote its derivative by $g^+(t)$. Thus, the function $f$ is differentiable almost everywhere, with derivative $g^+(t) - g^-(t)$. Moreover, by condition (19), for almost every $t$, it holds that

$$g^+(t) - g^-(t) \le -\epsilon.$$

To conclude, for $s < t \le T$, write

$$f(t) - f(s) \le \int_s^t (g^+(u) - g^-(u))\,du - \nu^-((s,t)) \le -\epsilon(t - s),$$

which is the announced result.



APPENDIX D: PROOF OF LEMMA 7 



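Proof. Using the notation $u^+ = \max(0, u)$ and $u^- = \max(0, -u)$, note that the factor of $p_{rs}^{(n)}$ in (23) reads

$$(e^{u_s} - 1)\Big[u_s - \sum_{\ell \in \mathcal{R}} p_{s\ell} u_\ell\Big] = \big[(e^{u_s} - 1)^+ - (e^{u_s} - 1)^-\big] \times \Big[u_s^+ - u_s^- - \sum_{\ell \in \mathcal{R}} p_{s\ell}(u_\ell^+ - u_\ell^-)\Big]$$

$$= (e^{u_s^+} - 1)\Big[u_s^+ - \sum_{\ell \in \mathcal{R}} p_{s\ell} u_\ell^+\Big] + (e^{-u_s^-} - 1)\Big[-u_s^- - \sum_{\ell \in \mathcal{R}} p_{s\ell}(-u_\ell^-)\Big] + (e^{u_s^+} - 1)\sum_{\ell \in \mathcal{R}} p_{s\ell} u_\ell^- + (1 - e^{-u_s^-})\sum_{\ell \in \mathcal{R}} p_{s\ell} u_\ell^+.$$

In order to obtain the above expansion, we have used the fact that $(e^{u_s^+} - 1)u_s^- = 0$.

Note that the last two terms in this expansion are nonnegative. Note also that the first two terms both read

$$(e^{v_s} - 1)\Big[v_s - \sum_{\ell \in \mathcal{R}} p_{s\ell} v_\ell\Big]$$

for adequate choices of $v_s$, namely $v_s = u_s^+$ for the first term, and $v_s = -u_s^-$ for the second term. This establishes claim (ii) of the lemma. This further implies that, in order to prove claims (i) and (iii) of the lemma, it is sufficient to restrict attention to the case where the $u_s$ all have the same sign, which we now assume.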

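Introduce the notation

$$M(s,\ell) := \sum_{n \ge 0} p_{rs}^{(n)} p_{s\ell}, \qquad N(s,\ell) := M(s,\ell) + \mathbf{1}_{\ell = r} \sum_{n \ge 0} p_{rs}^{(n)} \Big(1 - \sum_{k \in \mathcal{R}} p_{sk}\Big).$$

Condition (23) thus reads

$$\sum_{s,\ell \in \mathcal{R}} M(s,\ell)(e^{u_s} - 1)u_\ell \le \sum_{s,\ell \in \mathcal{R}} N(s,\ell)(e^{u_s} - 1)u_s.$$

Let us now show that this last condition is satisfied. Note first that it is enough to prove the same inequality, with $N$ instead of $M$ in the left-hand side, since

(37) $\sum_{s,\ell \in \mathcal{R}} [N(s,\ell) - M(s,\ell)](e^{u_s} - 1)u_\ell = \sum_{s \in \mathcal{R}} \sum_{n \ge 0} p_{rs}^{(n)} \Big(1 - \sum_{k \in \mathcal{R}} p_{sk}\Big)(e^{u_s} - 1)u_r,$

and this difference is indeed nonnegative under the current assumption that the $u_s$ all have the same sign.
We thus need to show that

(38) $\sum_{s,\ell \in \mathcal{R}} N(s,\ell)(e^{u_s} - 1)u_\ell \le \sum_{s,\ell \in \mathcal{R}} N(s,\ell)(e^{u_s} - 1)u_s.$

Note now that the marginals of the measure $N(\cdot,\cdot)$ coincide. Indeed,

$$\sum_{s \in \mathcal{R}} N(s,\ell) = \sum_{s \in \mathcal{R}} \sum_{n \ge 0} p_{rs}^{(n)} p_{s\ell} + \mathbf{1}_{\ell = r} \sum_{n \ge 0} \sum_{s \in \mathcal{R}} \big(p_{rs}^{(n)} - p_{rs}^{(n+1)}\big) = \sum_{n \ge 0} p_{r\ell}^{(n)}.$$

Thus, after renormalization of both sides of (38) by the total mass of the measure $N$, it equivalently reads

(39) $\mathbf{E}[(e^U - 1)V] \le \mathbf{E}[(e^U - 1)U],$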

where the random variables $U$, $V$ have the same distributions. An inequality due to Hoeffding [12] (see also [28] and [7] for more easily accessible references) states that, given two random variables $U$, $V$ with identical distributions, for any two nondecreasing functions $f, g : \mathbb{R} \to \mathbb{R}$ such that $f(U)$ and $g(U)$ have finite variances, one has

$$\mathbf{E}[f(U)g(V)] \le \mathbf{E}[f(U)g(U)].$$

Note that the inequality we need to prove is of that form, with as nondecreasing functions $f(U) = U$ and $g(U) = e^U - 1$. Finiteness of variances is trivially satisfied as the random variables $U$ take only finitely many values. This concludes the proof of the first claim of the lemma.

Let us now show that, in order to have equality in (23), all $u_s$ such that $\sum_{n \ge 0} p_{rs}^{(n)} > 0$ must be zero. Equality in (23) implies equality in (39). However, the latter holds if and only if the distributions of $(f(U), g(V))$ and $(f(U), g(U))$ coincide; see, for example, [28]. As the functions $f$, $g$ are strictly increasing, this in turn holds if the distributions of $(U, V)$ and $(U, U)$ coincide. This means that we can partition the set $\mathcal{R}$ such that on each subset of the partition, the $u_s$ are constant, and for $s, \ell$ in two different subsets of the partition, $N(s,\ell) = 0$. Thus, for all $s$ such that $\sum_{n \ge 0} p_{rs}^{(n)} > 0$, we must have $u_r = u_s$. This is needed to ensure equality in (38). However, in order to ensure equality in (23), the right-hand side of (37) must also be zero, which, using the fact that all $u_s$ coincide, also reads

$$u_r(e^{u_r} - 1) \sum_{s \in \mathcal{R}} \sum_{n \ge 0} p_{rs}^{(n)} \Big(1 - \sum_{k \in \mathcal{R}} p_{sk}\Big) = 0.$$

Equivalently, one must have $u_r(\exp(u_r) - 1) = 0$, that is, $u_r = 0$, which concludes the proof of the lemma. □
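The nonnegativity asserted in Lemma 7(i) lends itself to a quick numerical sanity check. The sketch below rests on an assumption: it takes $F_r(u) = \sum_s \sum_{n \ge 0} p_{rs}^{(n)} (e^{u_s} - 1)\big[u_s - \sum_\ell p_{s\ell} u_\ell\big]$, the form suggested by the factor discussed in this proof, since the statement of (23) itself is not reproduced in this appendix. The helper names and the random test instances are hypothetical.

```python
import math
import random

def visit_weights(P, r, tol=1e-13, max_iter=100_000):
    """pi[s] = sum_{n>=0} (P^n)[r][s] for a sub-stochastic matrix P."""
    n = len(P)
    current = [1.0 if s == r else 0.0 for s in range(n)]
    pi = [0.0] * n
    for _ in range(max_iter):
        pi = [a + b for a, b in zip(pi, current)]
        current = [sum(current[k] * P[k][s] for k in range(n)) for s in range(n)]
        if sum(current) < tol:
            break
    return pi

def F(P, r, u):
    """Assumed form: sum_s pi_s (e^{u_s} - 1) (u_s - sum_l p_sl u_l)."""
    n = len(P)
    pi = visit_weights(P, r)
    return sum(pi[s] * math.expm1(u[s])
               * (u[s] - sum(P[s][l] * u[l] for l in range(n)))
               for s in range(n))

rng = random.Random(1)
values = []
for _trial in range(200):
    n = 3
    # Random routing matrix with row sums at most 0.9, hence spectral radius < 1.
    P = []
    for _row in range(n):
        w = [rng.random() for _ in range(n)]
        scale = 0.9 * rng.random() / sum(w)
        P.append([scale * v for v in w])
    u = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    values.append(F(P, 0, u))
```

On these random instances every value in `values` is nonnegative up to rounding, which is consistent with claim (i) under the assumed form of $F_r$.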

Remark A.1. Note that the statement of Lemma 7 remains true if we replace the terms $[\exp(u_s) - 1]$ by $f(u_s)$ in (23), where $f$ is any strictly increasing function such that $f(0) = 0$.